All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 17:18 ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening
  Cc: Jonathan Corbet, Ingo Molnar, David S. Miller, netdev,
	Peter Zijlstra, Linus Torvalds, Djalal Harouni

Hi List,

This is v5 of automatic module load restriction series. I have renamed
it to 'Improve Module autoloading infrastructure' for abvious reasons:

* It is more of an infrastructure change now.
* Reduce the use of word 'restriction' and use 'improving' maybe easy
  to sell  ?

These patches are based on next-20171127

Credits:
========
The idea was inspired from grsecurity 'GRKERNSEC_MODHARDEN' config option.
However, upstream Linux implementation is more focused on the run-time
behavior with a three mode switch, plus upstream version solves Linux
usecases with a per process tree flag that can be used in containers,
sandboxes, etc to block direct implicit auto-load operations.
This implementation does not share anything with grsecurity.

The first RFC was as LSM, Kees Cook and others suggested that it can be
turned as a core kernel feature and after some months, they were right.


Previous versions:
==================

v1 RFC as an LSM:
http://www.openwall.com/lists/kernel-hardening/2017/02/02/21

v2 RFC as an LSM:
http://www.openwall.com/lists/kernel-hardening/2017/04/09/1

v3 core feature as requested by kernel maintainers:
https://lkml.org/lkml/2017/4/19/1086

v4:
https://lkml.org/lkml/2017/5/22/312

All previous feedback from: Andy Lutomirski, Solar Designer, Kees Cook,
Rusty Russell, Ben Hutchings and Serge Hallyn is fixed in this series.

Please check Changelog for details. Thank you for the feedback.


==============

Currently, an explicit call to load or unload kernel modules require
CAP_SYS_MODULE capability. However unprivileged users have always been
able to load some modules using the implicit auto-load operation. An
automatic module loading happens when programs request a kernel feature
from a module that is not loaded. In order to satisfy userspace, the
kernel then automatically load all these required modules.

Historically, the kernel was always able to automatically load modules
if they are not blacklisted. This is one of the most important and
transparent operations of Linux, it allows to provide numerous other
features as they are needed which is crucial for a better user experience.
However, as Linux is popular now and used for different appliances some
of these may need to control such operations. For such systems, recent
needs showed that in some cases allowing to control automatic module
loading is as important as the operation itself. Restricting unprivileged
programs or attackers that abuse this feature to load unused modules or
modules that contain bugs is a significant security measure.

This allows administrators or some special programs to have the
appropriate time to update and deny module autoloading in advance, then
blacklist the corresponding ones. Not doing so may affect the global state
of the machine, especially containers where some apps are moved from one
context to another and not having such mechanisms may allow to expose
and exploit the vulnerable parts to escape the container sandbox.

Embedded or IoT devices also started to ship as containers using generic
distros, some vendors do not have the appropriate time to make their own
OS, hence, using base images is getting popular. These setups may include
unnecessary modules that the final applications will not need. Untrusted
access may abuse the module auto-load feature to expose vulnerabilities.

As every code contains bugs or vulnerabilties, the following
vulnerabilities that affected some features that are often compiled as
modules could have been completely blocked, by a better mechanism that
handles module autoloading, especially if the system does not need them.

Past months:
* DCCP use after free CVE-2017-6074 [1] [2]
  Unprivileged to local root.

* XFRM framework CVE-2017-7184 [3]
  As advertised it seems it was used to break Ubuntu on a security
  contest.

* n_hldc CVE-2017-2636 [4] [5]
  Local privilege escalation. 

* L2TPv3 CVE-2016-10200

The list is longer.

To improve the current status, this series tries to re-work how module
autoloading is performed by adding two new properties:
"modules_autoload_mode" sysctl flag, and a per-task one.

The sysctl controls modules auto-load feature and complements
"modules_disabled" which apply to all modules operations. This new flag
allows to control only automatic module loading and if it is allowed or
not, aligning in the process the implicit operation with the explicit one
where both now are covered by capabilities checks.

The three modes that "modules_autoload_mode" support allow to provide
restrictions on automatic module loading without breaking user
experience.

The sysctl flag is available at "/proc/sys/kernel/modules_autoload_mode"

When modules_autoload_mode is set to (0), the default, there are no
restrictions.

When modules_autoload_mode is set to (1), processes must have
CAP_SYS_MODULE to be able to trigger a module auto-load operation,
or CAP_NET_ADMIN for modules with a 'netdev-%s' alias, or other
capabilities for specific aliased modules.

When modules_autoload_mode is set to (2), automatic module loading
is disabled for all.

Notes on relation between "modules_disabled=0" and
"modules_autoload_mode=2":
1) Once "modules_disabled=1" set, it needs a reboot to undo the
setting.

2) Restricting automatic module loading does not interfere with
explicit module load or unload operations.

3) New features provided by modules can be made available without
rebooting the system.

4) A bad version of a module can be unloaded and replaced with a
better one without rebooting the system.

==========================

The patches also support process trees, containers, and sandboxes by
providing an inherited per-task "modules_autoload_mode" flag that cannot be
re-enabled once disabled. This offers the following advantages:

1) Automatic module loading is still available to the rest of the
system.

2) It is easy to use in containers and sandboxes. DCCP example could
have been used to escape containers. The XFRM framework CVE-2017-7184
needs CAP_NET_ADMIN, but attackers may start to target CAP_NET_ADMIN,
a per-task flag will make it harder.

3) Suitable for desktop and more interactive Linux systems.

4) Will allow in future to implement a per user policy.
The user database format is old and not extensible, as discussed maybe
with a modern format we may achieve the following:

User=djalal
NewKernelFeatures=yes

Which means that that interactive user will be allowed to load extra
Linux features. Others, volatile accounts or guests can be easily
blocked from doing so.

5) CAP_NET_ADMIN is useful, it handles lot of operations, at same time it
started to look more like CAP_SYS_ADMIN which is overloaded. We need
CAP_NET_ADMIN, containers need it, but at same time maybe we do not
want programs running with it to load 'netdev-%s' modules. Having an
extra per-task flag allow to discharge a bit CAP_NET_ADMIN and clearly
target automatic module loading operations.


Usage:
        prctl(PR_SET_MODULES_AUTOLOAD_mode, value, 0, 0, 0).

The per-task "modules_autoload_mode" supports the following values:
0       There are no restrictions, usually the default unless set
        by parent.
1       The task must have CAP_SYS_MODULE to be able to trigger a
        module auto-load operation, or CAP_NET_ADMIN for modules with
        a 'netdev-%s' alias.
2       Automatic modules loading is disabled for the current task.

The mode may only be increased, never decreased, thus ensuring that once
applied, processes can never relax their setting. This make it easy for
developers and users to handle.

Note that even if the per-task "modules_autoload_mode" allows to auto-load
the corresponding modules, automatic module loading may still fail due to
the global sysctl "modules_autoload_mode". For more details please see
Documentation/sysctl/kernel.txt, section "modules_autoload_mode".

When a request to a kernel module is denied, the module name with the
corresponding process name and its pid are logged. Administrators can use
such information to explicitly load the appropriate modules.


# Testing:

##) Global sysctl "modules_autoload_mode"

Before patch:
$ lsmod | grep ipip -
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
$ lsmod | grep ipip -
ipip                   16384  0
tunnel4                16384  1 ipip
ip_tunnel              28672  1 ipip
$ cat /proc/sys/kernel/modules_autoload_mode
0

After patch:
$ lsmod | grep ipip -
# echo 2 > /proc/sys/kernel/modules_autoload_mode
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
add tunnel "tunl0" failed: No such device
$ dmesg
...
[ 1876.378389] module: automatic module loading of netdev-tunl0 by "ip"[1453] was denied
[ 1876.380994] module: automatic module loading of tunl0 by "ip"[1453] was denied
...
$ lsmod | grep ipip -


##) Per-task "modules_autoload_mode" flag

Here we use DCCP as an example since the public PoC was against it.

The following tool can be used to test the feature:
https://gist.githubusercontent.com/tixxdz/cf567e4275714199a32c4a80de4ea63a/raw/13e52ea0ee65772871bcf10fb6c94fedd349f5c1/pr_modules_autoload_mode_test.c

DCCP use after free CVE-2017-6074 (unprivileged to local root):
The code path can be triggered by unprivileged, using the trigger.c
program for DCCP use after free [2] and that was fixed by
commit 5edabca9d4cff7f "dccp: fix freeing skb too early for IPV6_RECVPKTINFO".

Before patch:
$ lsmod | grep dccp
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = 3
...
$ lsmod | grep dccp
dccp_ipv6              24576  5
dccp_ipv4              24576  5 dccp_ipv6
dccp                  102400  2 dccp_ipv6,dccp_ipv4
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0


After:
Set task "modules_autoload_mode" to 1, privileged mode.
$ lsmod | grep dccp
$ ./pr_set_no_new_privs
$ grep NoNewPrivs /proc/self/status
NoNewPrivs:     1
$ ./pr_modules_autoload_mode_test 1
$ grep Modules /proc/self/status
ModulesAutoloadMode:    1
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
$ lsmod | grep dccp
$ dmesg
...
[ 4662.171994] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1759] was denied
[ 4662.177284] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1759] was denied
[ 4662.180181] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1759] was denied
[ 4662.181709] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1759] was denied


Now task "modules_autoload_mode" to 2, disabled mode.
$ lsmod | grep dccp
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ su - root
 # ./pr_modules_autoload_mode_test 2
 # grep Modules /proc/self/status
ModulesAutoloadMode:    2
 # strace ./dccp_trigger

...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
...
[ 5154.218740] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1873] was denied
[ 5154.219828] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1873] was denied
[ 5154.221814] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1873] was denied
[ 5154.222731] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1873] was denied


As showed, this blocks automatic module loading per-task. This allows to
provide a usable system, where only some sandboxed apps or containers will be
restricted to trigger automatic module loading, other parts of the
system can continue to use the feature as it is which is the case of the
desktop and userfriendly machines.

[1] http://www.openwall.com/lists/oss-security/2017/02/22/3
[2] https://github.com/xairy/kernel-exploits/tree/master/CVE-2017-6074
[3] http://www.openwall.com/lists/oss-security/2017/03/29/2
[4] http://www.openwall.com/lists/oss-security/2017/03/07/6 
[5] https://a13xp0p0v.github.io/2017/03/24/CVE-2017-2636.html


Finally we already have a use case for the prctl() interface to enforce
some systemd services, in docker and other containers, also in some
sandboxes, etc.


# Changes since v4:
*) Removed the property that when the "modules_autoload_mode" sysctl is
   set to "2" disabled mode, then that value is pinned and we can not
   revert it. Now you can undo the value if you have the appropriate
   privileges as it was suggested.

   Suggested-by: Solar Designer <solar@openwall.com>
   Suggested-by: Andy Lutomirski <luto@kernel.org>
   https://lkml.org/lkml/2017/5/22/330

*) Added request_module_cap() to take '@required_cap' and '@prefix'
   arguments that will be used to check if module autoloading is allowed
   or not.

   Suggested-by: Kees Cook <keescook@chromium.org>

*) More cleanups and documentation.


# Changes since v3:
*) Renamed the sysctl from "modules_autoload" to "modules_autoload_mode"
   and the prctl() operation flag to "PR_{SET|GET}_MODULES_AUTOLOAD_MODE"
   as it was requested.

   Suggested-by: Ben Hutchings <ben.hutchings@codethink.co.uk>


*) Updated __request_module() to take the capability that may allow to
   auto-load a module with the appropriate alias. This way we never
   parse aliases as it was requested by Rusty Russell. Security and
   SELinux hooks were updated too.

   Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
   https://lkml.org/lkml/2017/4/24/7


*) Updated code to set prctl(PR_SET_MODULES_AUTOLOAD_MODE, 1, 0, 0, 0),
   the task must call prctl(PR_SET_NO_NEW_PRIVS, 1) before or run with
   CAP_SYS_ADMIN privileges in its namespace. If these are not true,
   -EACCES will be returned.

   Suggested-by: Andy Lutomirski <luto@amacapital.net>
   https://lkml.org/lkml/2017/4/22/22


*) Remove task initialization logic and other cleanups
   Suggested-by: Kees Cook <keescook@chromium.org>


*) Other code and documentation cleanups.
   

# Changes since v2:
*) Implemented as a core kernel feature inside capabilities subsystem
*) Renamed sysctl to "modules_autoload" to align with "modules_disabled"

   Suggested-by: Kees Cook <keescook@chromium.org>

*) Improved documentation.
*) Removed unused code.


# Changes since v1:
*) Renamed module to ModAutoRestrict
*) Improved documentation to explicity refer to module autoloading.
*) Switched to use the new task_security_alloc() hook.
*) Switched from rhash tables to use task->security since it is in
   linux-security/next branch now.
*) Check all parameters passed to prctl() syscall.
*) Many other bug fixes and documentation improvements.


Patches (5) Djalal Harouni:
 (1/5) modules:capabilities: add request_module_cap()
 (2/5) modules:capabilities: add cap_kernel_module_request() permission check
 (3/5) modules:capabilities: automatic module loading restriction
 (4/5) modules:capabilities: add a per-task modules auto-load mode
 (5/5) net: modules: use request_module_cap() to load 'netdev-%s' modules

 Documentation/filesystems/proc.txt                 |   3 +
 Documentation/sysctl/kernel.txt                    |  54 ++++++++
 Documentation/userspace-api/index.rst              |   1 +
 .../userspace-api/modules_autoload_mode.rst        | 116 ++++++++++++++++
 fs/proc/array.c                                    |   6 +
 include/linux/init_task.h                          |   8 ++
 include/linux/kmod.h                               |  65 ++++++++-
 include/linux/lsm_hooks.h                          |   6 +-
 include/linux/module.h                             |  41 +++++-
 include/linux/sched.h                              |   5 +
 include/linux/security.h                           |  11 +-
 include/uapi/linux/prctl.h                         |   8 ++
 kernel/kmod.c                                      |  29 +++-
 kernel/module.c                                    | 153 +++++++++++++++++++++
 kernel/sysctl.c                                    |  28 ++++
 net/core/dev_ioctl.c                               |   4 +-
 security/commoncap.c                               |  62 +++++++++
 security/security.c                                |   6 +-
 security/selinux/hooks.c                           |   3 +-
 19 files changed, 587 insertions(+), 22 deletions(-)

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 17:18 ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: linux-security-module

Hi List,

This is v5 of automatic module load restriction series. I have renamed
it to 'Improve Module autoloading infrastructure' for abvious reasons:

* It is more of an infrastructure change now.
* Reduce the use of word 'restriction' and use 'improving' maybe easy
  to sell  ?

These patches are based on next-20171127

Credits:
========
The idea was inspired from grsecurity 'GRKERNSEC_MODHARDEN' config option.
However, upstream Linux implementation is more focused on the run-time
behavior with a three mode switch, plus upstream version solves Linux
usecases with a per process tree flag that can be used in containers,
sandboxes, etc to block direct implicit auto-load operations.
This implementation does not share anything with grsecurity.

The first RFC was as LSM, Kees Cook and others suggested that it can be
turned as a core kernel feature and after some months, they were right.


Previous versions:
==================

v1 RFC as an LSM:
http://www.openwall.com/lists/kernel-hardening/2017/02/02/21

v2 RFC as an LSM:
http://www.openwall.com/lists/kernel-hardening/2017/04/09/1

v3 core feature as requested by kernel maintainers:
https://lkml.org/lkml/2017/4/19/1086

v4:
https://lkml.org/lkml/2017/5/22/312

All previous feedback from: Andy Lutomirski, Solar Designer, Kees Cook,
Rusty Russell, Ben Hutchings and Serge Hallyn is fixed in this series.

Please check Changelog for details. Thank you for the feedback.


==============

Currently, an explicit call to load or unload kernel modules require
CAP_SYS_MODULE capability. However unprivileged users have always been
able to load some modules using the implicit auto-load operation. An
automatic module loading happens when programs request a kernel feature
from a module that is not loaded. In order to satisfy userspace, the
kernel then automatically load all these required modules.

Historically, the kernel was always able to automatically load modules
if they are not blacklisted. This is one of the most important and
transparent operations of Linux, it allows to provide numerous other
features as they are needed which is crucial for a better user experience.
However, as Linux is popular now and used for different appliances some
of these may need to control such operations. For such systems, recent
needs showed that in some cases allowing to control automatic module
loading is as important as the operation itself. Restricting unprivileged
programs or attackers that abuse this feature to load unused modules or
modules that contain bugs is a significant security measure.

This allows administrators or some special programs to have the
appropriate time to update and deny module autoloading in advance, then
blacklist the corresponding ones. Not doing so may affect the global state
of the machine, especially containers where some apps are moved from one
context to another and not having such mechanisms may allow to expose
and exploit the vulnerable parts to escape the container sandbox.

Embedded or IoT devices also started to ship as containers using generic
distros, some vendors do not have the appropriate time to make their own
OS, hence, using base images is getting popular. These setups may include
unnecessary modules that the final applications will not need. Untrusted
access may abuse the module auto-load feature to expose vulnerabilities.

As every code contains bugs or vulnerabilties, the following
vulnerabilities that affected some features that are often compiled as
modules could have been completely blocked, by a better mechanism that
handles module autoloading, especially if the system does not need them.

Past months:
* DCCP use after free CVE-2017-6074 [1] [2]
  Unprivileged to local root.

* XFRM framework CVE-2017-7184 [3]
  As advertised it seems it was used to break Ubuntu on a security
  contest.

* n_hldc CVE-2017-2636 [4] [5]
  Local privilege escalation. 

* L2TPv3 CVE-2016-10200

The list is longer.

To improve the current status, this series tries to re-work how module
autoloading is performed by adding two new properties:
"modules_autoload_mode" sysctl flag, and a per-task one.

The sysctl controls modules auto-load feature and complements
"modules_disabled" which apply to all modules operations. This new flag
allows to control only automatic module loading and if it is allowed or
not, aligning in the process the implicit operation with the explicit one
where both now are covered by capabilities checks.

The three modes that "modules_autoload_mode" support allow to provide
restrictions on automatic module loading without breaking user
experience.

The sysctl flag is available at "/proc/sys/kernel/modules_autoload_mode"

When modules_autoload_mode is set to (0), the default, there are no
restrictions.

When modules_autoload_mode is set to (1), processes must have
CAP_SYS_MODULE to be able to trigger a module auto-load operation,
or CAP_NET_ADMIN for modules with a 'netdev-%s' alias, or other
capabilities for specific aliased modules.

When modules_autoload_mode is set to (2), automatic module loading
is disabled for all.

Notes on relation between "modules_disabled=0" and
"modules_autoload_mode=2":
1) Once "modules_disabled=1" set, it needs a reboot to undo the
setting.

2) Restricting automatic module loading does not interfere with
explicit module load or unload operations.

3) New features provided by modules can be made available without
rebooting the system.

4) A bad version of a module can be unloaded and replaced with a
better one without rebooting the system.

==========================

The patches also support process trees, containers, and sandboxes by
providing an inherited per-task "modules_autoload_mode" flag that cannot be
re-enabled once disabled. This offers the following advantages:

1) Automatic module loading is still available to the rest of the
system.

2) It is easy to use in containers and sandboxes. DCCP example could
have been used to escape containers. The XFRM framework CVE-2017-7184
needs CAP_NET_ADMIN, but attackers may start to target CAP_NET_ADMIN,
a per-task flag will make it harder.

3) Suitable for desktop and more interactive Linux systems.

4) Will allow in future to implement a per user policy.
The user database format is old and not extensible, as discussed maybe
with a modern format we may achieve the following:

User=djalal
NewKernelFeatures=yes

Which means that that interactive user will be allowed to load extra
Linux features. Others, volatile accounts or guests can be easily
blocked from doing so.

5) CAP_NET_ADMIN is useful, it handles lot of operations, at same time it
started to look more like CAP_SYS_ADMIN which is overloaded. We need
CAP_NET_ADMIN, containers need it, but at same time maybe we do not
want programs running with it to load 'netdev-%s' modules. Having an
extra per-task flag allow to discharge a bit CAP_NET_ADMIN and clearly
target automatic module loading operations.


Usage:
        prctl(PR_SET_MODULES_AUTOLOAD_mode, value, 0, 0, 0).

The per-task "modules_autoload_mode" supports the following values:
0       There are no restrictions, usually the default unless set
        by parent.
1       The task must have CAP_SYS_MODULE to be able to trigger a
        module auto-load operation, or CAP_NET_ADMIN for modules with
        a 'netdev-%s' alias.
2       Automatic modules loading is disabled for the current task.

The mode may only be increased, never decreased, thus ensuring that once
applied, processes can never relax their setting. This make it easy for
developers and users to handle.

Note that even if the per-task "modules_autoload_mode" allows to auto-load
the corresponding modules, automatic module loading may still fail due to
the global sysctl "modules_autoload_mode". For more details please see
Documentation/sysctl/kernel.txt, section "modules_autoload_mode".

When a request to a kernel module is denied, the module name with the
corresponding process name and its pid are logged. Administrators can use
such information to explicitly load the appropriate modules.


# Testing:

##) Global sysctl "modules_autoload_mode"

Before patch:
$ lsmod | grep ipip -
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
$ lsmod | grep ipip -
ipip                   16384  0
tunnel4                16384  1 ipip
ip_tunnel              28672  1 ipip
$ cat /proc/sys/kernel/modules_autoload_mode
0

After patch:
$ lsmod | grep ipip -
# echo 2 > /proc/sys/kernel/modules_autoload_mode
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
add tunnel "tunl0" failed: No such device
$ dmesg
...
[ 1876.378389] module: automatic module loading of netdev-tunl0 by "ip"[1453] was denied
[ 1876.380994] module: automatic module loading of tunl0 by "ip"[1453] was denied
...
$ lsmod | grep ipip -


##) Per-task "modules_autoload_mode" flag

Here we use DCCP as an example since the public PoC was against it.

The following tool can be used to test the feature:
https://gist.githubusercontent.com/tixxdz/cf567e4275714199a32c4a80de4ea63a/raw/13e52ea0ee65772871bcf10fb6c94fedd349f5c1/pr_modules_autoload_mode_test.c

DCCP use after free CVE-2017-6074 (unprivileged to local root):
The code path can be triggered by unprivileged, using the trigger.c
program for DCCP use after free [2] and that was fixed by
commit 5edabca9d4cff7f "dccp: fix freeing skb too early for IPV6_RECVPKTINFO".

Before patch:
$ lsmod | grep dccp
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = 3
...
$ lsmod | grep dccp
dccp_ipv6              24576  5
dccp_ipv4              24576  5 dccp_ipv6
dccp                  102400  2 dccp_ipv6,dccp_ipv4
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0


After:
Set task "modules_autoload_mode" to 1, privileged mode.
$ lsmod | grep dccp
$ ./pr_set_no_new_privs
$ grep NoNewPrivs /proc/self/status
NoNewPrivs:     1
$ ./pr_modules_autoload_mode_test 1
$ grep Modules /proc/self/status
ModulesAutoloadMode:    1
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
$ lsmod | grep dccp
$ dmesg
...
[ 4662.171994] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1759] was denied
[ 4662.177284] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1759] was denied
[ 4662.180181] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1759] was denied
[ 4662.181709] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1759] was denied


Now task "modules_autoload_mode" to 2, disabled mode.
$ lsmod | grep dccp
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ su - root
 # ./pr_modules_autoload_mode_test 2
 # grep Modules /proc/self/status
ModulesAutoloadMode:    2
 # strace ./dccp_trigger

...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
...
[ 5154.218740] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1873] was denied
[ 5154.219828] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1873] was denied
[ 5154.221814] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1873] was denied
[ 5154.222731] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1873] was denied


As showed, this blocks automatic module loading per-task. This allows to
provide a usable system, where only some sandboxed apps or containers will be
restricted to trigger automatic module loading, other parts of the
system can continue to use the feature as it is which is the case of the
desktop and userfriendly machines.

[1] http://www.openwall.com/lists/oss-security/2017/02/22/3
[2] https://github.com/xairy/kernel-exploits/tree/master/CVE-2017-6074
[3] http://www.openwall.com/lists/oss-security/2017/03/29/2
[4] http://www.openwall.com/lists/oss-security/2017/03/07/6 
[5] https://a13xp0p0v.github.io/2017/03/24/CVE-2017-2636.html


Finally we already have a use case for the prctl() interface to enforce
some systemd services, in docker and other containers, also in some
sandboxes, etc.


# Changes since v4:
*) Removed the property that when the "modules_autoload_mode" sysctl is
   set to "2" disabled mode, then that value is pinned and we can not
   revert it. Now you can undo the value if you have the appropriate
   privileges as it was suggested.

   Suggested-by: Solar Designer <solar@openwall.com>
   Suggested-by: Andy Lutomirski <luto@kernel.org>
   https://lkml.org/lkml/2017/5/22/330

*) Added request_module_cap() to take '@required_cap' and '@prefix'
   arguments that will be used to check if module autoloading is allowed
   or not.

   Suggested-by: Kees Cook <keescook@chromium.org>

*) More cleanups and documentation.


# Changes since v3:
*) Renamed the sysctl from "modules_autoload" to "modules_autoload_mode"
   and the prctl() operation flag to "PR_{SET|GET}_MODULES_AUTOLOAD_MODE"
   as it was requested.

   Suggested-by: Ben Hutchings <ben.hutchings@codethink.co.uk>


*) Updated __request_module() to take the capability that may allow to
   auto-load a module with the appropriate alias. This way we never
   parse aliases as it was requested by Rusty Russell. Security and
   SELinux hooks were updated too.

   Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
   https://lkml.org/lkml/2017/4/24/7


*) Updated code to set prctl(PR_SET_MODULES_AUTOLOAD_MODE, 1, 0, 0, 0),
   the task must call prctl(PR_SET_NO_NEW_PRIVS, 1) before or run with
   CAP_SYS_ADMIN privileges in its namespace. If these are not true,
   -EACCES will be returned.

   Suggested-by: Andy Lutomirski <luto@amacapital.net>
   https://lkml.org/lkml/2017/4/22/22


*) Remove task initialization logic and other cleanups
   Suggested-by: Kees Cook <keescook@chromium.org>


*) Other code and documentation cleanups.
   

# Changes since v2:
*) Implemented as a core kernel feature inside capabilities subsystem
*) Renamed sysctl to "modules_autoload" to align with "modules_disabled"

   Suggested-by: Kees Cook <keescook@chromium.org>

*) Improved documentation.
*) Removed unused code.


# Changes since v1:
*) Renamed module to ModAutoRestrict
*) Improved documentation to explicity refer to module autoloading.
*) Switched to use the new task_security_alloc() hook.
*) Switched from rhash tables to use task->security since it is in
   linux-security/next branch now.
*) Check all parameters passed to prctl() syscall.
*) Many other bug fixes and documentation improvements.


Patches (5) Djalal Harouni:
 (1/5) modules:capabilities: add request_module_cap()
 (2/5) modules:capabilities: add cap_kernel_module_request() permission check
 (3/5) modules:capabilities: automatic module loading restriction
 (4/5) modules:capabilities: add a per-task modules auto-load mode
 (5/5) net: modules: use request_module_cap() to load 'netdev-%s' modules

 Documentation/filesystems/proc.txt                 |   3 +
 Documentation/sysctl/kernel.txt                    |  54 ++++++++
 Documentation/userspace-api/index.rst              |   1 +
 .../userspace-api/modules_autoload_mode.rst        | 116 ++++++++++++++++
 fs/proc/array.c                                    |   6 +
 include/linux/init_task.h                          |   8 ++
 include/linux/kmod.h                               |  65 ++++++++-
 include/linux/lsm_hooks.h                          |   6 +-
 include/linux/module.h                             |  41 +++++-
 include/linux/sched.h                              |   5 +
 include/linux/security.h                           |  11 +-
 include/uapi/linux/prctl.h                         |   8 ++
 kernel/kmod.c                                      |  29 +++-
 kernel/module.c                                    | 153 +++++++++++++++++++++
 kernel/sysctl.c                                    |  28 ++++
 net/core/dev_ioctl.c                               |   4 +-
 security/commoncap.c                               |  62 +++++++++
 security/security.c                                |   6 +-
 security/selinux/hooks.c                           |   3 +-
 19 files changed, 587 insertions(+), 22 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 17:18 ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening
  Cc: Jonathan Corbet, Ingo Molnar, David S. Miller, netdev,
	Peter Zijlstra, Linus Torvalds, Djalal Harouni

Hi List,

This is v5 of automatic module load restriction series. I have renamed
it to 'Improve Module autoloading infrastructure' for abvious reasons:

* It is more of an infrastructure change now.
* Reduce the use of word 'restriction' and use 'improving' maybe easy
  to sell  ?

These patches are based on next-20171127

Credits:
========
The idea was inspired from grsecurity 'GRKERNSEC_MODHARDEN' config option.
However, upstream Linux implementation is more focused on the run-time
behavior with a three mode switch, plus upstream version solves Linux
usecases with a per process tree flag that can be used in containers,
sandboxes, etc to block direct implicit auto-load operations.
This implementation does not share anything with grsecurity.

The first RFC was as LSM, Kees Cook and others suggested that it can be
turned as a core kernel feature and after some months, they were right.


Previous versions:
==================

v1 RFC as an LSM:
http://www.openwall.com/lists/kernel-hardening/2017/02/02/21

v2 RFC as an LSM:
http://www.openwall.com/lists/kernel-hardening/2017/04/09/1

v3 core feature as requested by kernel maintainers:
https://lkml.org/lkml/2017/4/19/1086

v4:
https://lkml.org/lkml/2017/5/22/312

All previous feedback from: Andy Lutomirski, Solar Designer, Kees Cook,
Rusty Russell, Ben Hutchings and Serge Hallyn is fixed in this series.

Please check Changelog for details. Thank you for the feedback.


==============

Currently, an explicit call to load or unload kernel modules require
CAP_SYS_MODULE capability. However unprivileged users have always been
able to load some modules using the implicit auto-load operation. An
automatic module loading happens when programs request a kernel feature
from a module that is not loaded. In order to satisfy userspace, the
kernel then automatically load all these required modules.

Historically, the kernel was always able to automatically load modules
if they are not blacklisted. This is one of the most important and
transparent operations of Linux, it allows to provide numerous other
features as they are needed which is crucial for a better user experience.
However, as Linux is popular now and used for different appliances some
of these may need to control such operations. For such systems, recent
needs showed that in some cases allowing to control automatic module
loading is as important as the operation itself. Restricting unprivileged
programs or attackers that abuse this feature to load unused modules or
modules that contain bugs is a significant security measure.

This allows administrators or some special programs to have the
appropriate time to update and deny module autoloading in advance, then
blacklist the corresponding ones. Not doing so may affect the global state
of the machine, especially containers where some apps are moved from one
context to another and not having such mechanisms may allow to expose
and exploit the vulnerable parts to escape the container sandbox.

Embedded or IoT devices also started to ship as containers using generic
distros, some vendors do not have the appropriate time to make their own
OS, hence, using base images is getting popular. These setups may include
unnecessary modules that the final applications will not need. Untrusted
access may abuse the module auto-load feature to expose vulnerabilities.

As every code contains bugs or vulnerabilties, the following
vulnerabilities that affected some features that are often compiled as
modules could have been completely blocked, by a better mechanism that
handles module autoloading, especially if the system does not need them.

Past months:
* DCCP use after free CVE-2017-6074 [1] [2]
  Unprivileged to local root.

* XFRM framework CVE-2017-7184 [3]
  As advertised it seems it was used to break Ubuntu on a security
  contest.

* n_hldc CVE-2017-2636 [4] [5]
  Local privilege escalation. 

* L2TPv3 CVE-2016-10200

The list is longer.

To improve the current status, this series tries to re-work how module
autoloading is performed by adding two new properties:
"modules_autoload_mode" sysctl flag, and a per-task one.

The sysctl controls modules auto-load feature and complements
"modules_disabled" which apply to all modules operations. This new flag
allows to control only automatic module loading and if it is allowed or
not, aligning in the process the implicit operation with the explicit one
where both now are covered by capabilities checks.

The three modes that "modules_autoload_mode" support allow to provide
restrictions on automatic module loading without breaking user
experience.

The sysctl flag is available at "/proc/sys/kernel/modules_autoload_mode"

When modules_autoload_mode is set to (0), the default, there are no
restrictions.

When modules_autoload_mode is set to (1), processes must have
CAP_SYS_MODULE to be able to trigger a module auto-load operation,
or CAP_NET_ADMIN for modules with a 'netdev-%s' alias, or other
capabilities for specific aliased modules.

When modules_autoload_mode is set to (2), automatic module loading
is disabled for all.

Notes on relation between "modules_disabled=0" and
"modules_autoload_mode=2":
1) Once "modules_disabled=1" set, it needs a reboot to undo the
setting.

2) Restricting automatic module loading does not interfere with
explicit module load or unload operations.

3) New features provided by modules can be made available without
rebooting the system.

4) A bad version of a module can be unloaded and replaced with a
better one without rebooting the system.

==========================

The patches also support process trees, containers, and sandboxes by
providing an inherited per-task "modules_autoload_mode" flag that cannot be
re-enabled once disabled. This offers the following advantages:

1) Automatic module loading is still available to the rest of the
system.

2) It is easy to use in containers and sandboxes. DCCP example could
have been used to escape containers. The XFRM framework CVE-2017-7184
needs CAP_NET_ADMIN, but attackers may start to target CAP_NET_ADMIN,
a per-task flag will make it harder.

3) Suitable for desktop and more interactive Linux systems.

4) Will allow in future to implement a per user policy.
The user database format is old and not extensible, as discussed maybe
with a modern format we may achieve the following:

User=djalal
NewKernelFeatures=yes

Which means that that interactive user will be allowed to load extra
Linux features. Others, volatile accounts or guests can be easily
blocked from doing so.

5) CAP_NET_ADMIN is useful, it handles lot of operations, at same time it
started to look more like CAP_SYS_ADMIN which is overloaded. We need
CAP_NET_ADMIN, containers need it, but at same time maybe we do not
want programs running with it to load 'netdev-%s' modules. Having an
extra per-task flag allow to discharge a bit CAP_NET_ADMIN and clearly
target automatic module loading operations.


Usage:
        prctl(PR_SET_MODULES_AUTOLOAD_mode, value, 0, 0, 0).

The per-task "modules_autoload_mode" supports the following values:
0       There are no restrictions, usually the default unless set
        by parent.
1       The task must have CAP_SYS_MODULE to be able to trigger a
        module auto-load operation, or CAP_NET_ADMIN for modules with
        a 'netdev-%s' alias.
2       Automatic modules loading is disabled for the current task.

The mode may only be increased, never decreased, thus ensuring that once
applied, processes can never relax their setting. This make it easy for
developers and users to handle.

Note that even if the per-task "modules_autoload_mode" allows to auto-load
the corresponding modules, automatic module loading may still fail due to
the global sysctl "modules_autoload_mode". For more details please see
Documentation/sysctl/kernel.txt, section "modules_autoload_mode".

When a request to a kernel module is denied, the module name with the
corresponding process name and its pid are logged. Administrators can use
such information to explicitly load the appropriate modules.


# Testing:

##) Global sysctl "modules_autoload_mode"

Before patch:
$ lsmod | grep ipip -
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
$ lsmod | grep ipip -
ipip                   16384  0
tunnel4                16384  1 ipip
ip_tunnel              28672  1 ipip
$ cat /proc/sys/kernel/modules_autoload_mode
0

After patch:
$ lsmod | grep ipip -
# echo 2 > /proc/sys/kernel/modules_autoload_mode
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
add tunnel "tunl0" failed: No such device
$ dmesg
...
[ 1876.378389] module: automatic module loading of netdev-tunl0 by "ip"[1453] was denied
[ 1876.380994] module: automatic module loading of tunl0 by "ip"[1453] was denied
...
$ lsmod | grep ipip -


##) Per-task "modules_autoload_mode" flag

Here we use DCCP as an example since the public PoC was against it.

The following tool can be used to test the feature:
https://gist.githubusercontent.com/tixxdz/cf567e4275714199a32c4a80de4ea63a/raw/13e52ea0ee65772871bcf10fb6c94fedd349f5c1/pr_modules_autoload_mode_test.c

DCCP use after free CVE-2017-6074 (unprivileged to local root):
The code path can be triggered by unprivileged, using the trigger.c
program for DCCP use after free [2] and that was fixed by
commit 5edabca9d4cff7f "dccp: fix freeing skb too early for IPV6_RECVPKTINFO".

Before patch:
$ lsmod | grep dccp
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = 3
...
$ lsmod | grep dccp
dccp_ipv6              24576  5
dccp_ipv4              24576  5 dccp_ipv6
dccp                  102400  2 dccp_ipv6,dccp_ipv4
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0


After:
Set task "modules_autoload_mode" to 1, privileged mode.
$ lsmod | grep dccp
$ ./pr_set_no_new_privs
$ grep NoNewPrivs /proc/self/status
NoNewPrivs:     1
$ ./pr_modules_autoload_mode_test 1
$ grep Modules /proc/self/status
ModulesAutoloadMode:    1
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
$ lsmod | grep dccp
$ dmesg
...
[ 4662.171994] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1759] was denied
[ 4662.177284] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1759] was denied
[ 4662.180181] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1759] was denied
[ 4662.181709] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1759] was denied


Now task "modules_autoload_mode" to 2, disabled mode.
$ lsmod | grep dccp
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ su - root
 # ./pr_modules_autoload_mode_test 2
 # grep Modules /proc/self/status
ModulesAutoloadMode:    2
 # strace ./dccp_trigger

...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
...
[ 5154.218740] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1873] was denied
[ 5154.219828] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1873] was denied
[ 5154.221814] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1873] was denied
[ 5154.222731] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1873] was denied


As showed, this blocks automatic module loading per-task. This allows to
provide a usable system, where only some sandboxed apps or containers will be
restricted to trigger automatic module loading, other parts of the
system can continue to use the feature as it is which is the case of the
desktop and userfriendly machines.

[1] http://www.openwall.com/lists/oss-security/2017/02/22/3
[2] https://github.com/xairy/kernel-exploits/tree/master/CVE-2017-6074
[3] http://www.openwall.com/lists/oss-security/2017/03/29/2
[4] http://www.openwall.com/lists/oss-security/2017/03/07/6 
[5] https://a13xp0p0v.github.io/2017/03/24/CVE-2017-2636.html


Finally we already have a use case for the prctl() interface to enforce
some systemd services, in docker and other containers, also in some
sandboxes, etc.


# Changes since v4:
*) Removed the property that when the "modules_autoload_mode" sysctl is
   set to "2" disabled mode, then that value is pinned and we can not
   revert it. Now you can undo the value if you have the appropriate
   privileges as it was suggested.

   Suggested-by: Solar Designer <solar@openwall.com>
   Suggested-by: Andy Lutomirski <luto@kernel.org>
   https://lkml.org/lkml/2017/5/22/330

*) Added request_module_cap() to take '@required_cap' and '@prefix'
   arguments that will be used to check if module autoloading is allowed
   or not.

   Suggested-by: Kees Cook <keescook@chromium.org>

*) More cleanups and documentation.


# Changes since v3:
*) Renamed the sysctl from "modules_autoload" to "modules_autoload_mode"
   and the prctl() operation flag to "PR_{SET|GET}_MODULES_AUTOLOAD_MODE"
   as it was requested.

   Suggested-by: Ben Hutchings <ben.hutchings@codethink.co.uk>


*) Updated __request_module() to take the capability that may allow to
   auto-load a module with the appropriate alias. This way we never
   parse aliases as it was requested by Rusty Russell. Security and
   SELinux hooks were updated too.

   Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
   https://lkml.org/lkml/2017/4/24/7


*) Updated code to set prctl(PR_SET_MODULES_AUTOLOAD_MODE, 1, 0, 0, 0),
   the task must call prctl(PR_SET_NO_NEW_PRIVS, 1) before or run with
   CAP_SYS_ADMIN privileges in its namespace. If these are not true,
   -EACCES will be returned.

   Suggested-by: Andy Lutomirski <luto@amacapital.net>
   https://lkml.org/lkml/2017/4/22/22


*) Remove task initialization logic and other cleanups
   Suggested-by: Kees Cook <keescook@chromium.org>


*) Other code and documentation cleanups.
   

# Changes since v2:
*) Implemented as a core kernel feature inside capabilities subsystem
*) Renamed sysctl to "modules_autoload" to align with "modules_disabled"

   Suggested-by: Kees Cook <keescook@chromium.org>

*) Improved documentation.
*) Removed unused code.


# Changes since v1:
*) Renamed module to ModAutoRestrict
*) Improved documentation to explicity refer to module autoloading.
*) Switched to use the new task_security_alloc() hook.
*) Switched from rhash tables to use task->security since it is in
   linux-security/next branch now.
*) Check all parameters passed to prctl() syscall.
*) Many other bug fixes and documentation improvements.


Patches (5) Djalal Harouni:
 (1/5) modules:capabilities: add request_module_cap()
 (2/5) modules:capabilities: add cap_kernel_module_request() permission check
 (3/5) modules:capabilities: automatic module loading restriction
 (4/5) modules:capabilities: add a per-task modules auto-load mode
 (5/5) net: modules: use request_module_cap() to load 'netdev-%s' modules

 Documentation/filesystems/proc.txt                 |   3 +
 Documentation/sysctl/kernel.txt                    |  54 ++++++++
 Documentation/userspace-api/index.rst              |   1 +
 .../userspace-api/modules_autoload_mode.rst        | 116 ++++++++++++++++
 fs/proc/array.c                                    |   6 +
 include/linux/init_task.h                          |   8 ++
 include/linux/kmod.h                               |  65 ++++++++-
 include/linux/lsm_hooks.h                          |   6 +-
 include/linux/module.h                             |  41 +++++-
 include/linux/sched.h                              |   5 +
 include/linux/security.h                           |  11 +-
 include/uapi/linux/prctl.h                         |   8 ++
 kernel/kmod.c                                      |  29 +++-
 kernel/module.c                                    | 153 +++++++++++++++++++++
 kernel/sysctl.c                                    |  28 ++++
 net/core/dev_ioctl.c                               |   4 +-
 security/commoncap.c                               |  62 +++++++++
 security/security.c                                |   6 +-
 security/selinux/hooks.c                           |   3 +-
 19 files changed, 587 insertions(+), 22 deletions(-)

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-27 17:18 ` Djalal Harouni
  (?)
@ 2017-11-27 17:18   ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening
  Cc: Jonathan Corbet, Ingo Molnar, David S. Miller, netdev,
	Peter Zijlstra, Linus Torvalds, Djalal Harouni

This is a preparation patch to improve the module auto-load
infrastructure.

We need this patch to have more control on module auto-load operations.
The operation by default is allowed unless enduser or the calling code
requests that we need to perform futher permission checks.

With this change subsystems will be able to decide if module auto-load
feature first will have to do a capability check and load the module if
the permission check succeeds or deny the operation.

As an example "netdev-%s" modules, they are allowed to be loaded if
CAP_NET_ADMIN is set. Therefore, in order to not break this assumption,
and allow userspace to load "netdev-%s" modules with CAP_NET_ADMIN,
we have added:

        request_module_cap(required_cap, prefix, fmt...)

This new function will take:
'@required_cap': Required capability to load the module
'@prefix': The module prefix if any, otherwise NULL
'@fmt': printf style format string for the name of the module with its
        arguments if any

ex:
        request_module_cap(CAP_NET_ADMIN, "netdev", "%s", mod);

After a discussion with Rusty Russell [1], the suggestion was to pass
the capability from request_module() to security_kernel_module_request()
for 'netdev-%s' modules that need CAP_NET_ADMIN, and after review from
Kees Cook [2] and experimenting with the code, the patch now does the
following:

* Adds the request_module_cap() function.
* Updates the __request_module() to take the "required_cap" argument
        with the "prefix".

This patch also updates SELinux which is currently the only user of
security_kernel_module_request(), the security hook now accepts
'required_cap' and 'prefix' as arguments.

Based on patch by Rusty Russell and discussion with Kees Cook:
[1] https://lkml.org/lkml/2017/4/26/735
[2] https://lkml.org/lkml/2017/5/23/775

Cc: Serge Hallyn <serge@hallyn.com>
Cc: Andy Lutomirski <luto@kernel.org>
Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
---
 include/linux/kmod.h      | 65 ++++++++++++++++++++++++++++++++++++++++++-----
 include/linux/lsm_hooks.h |  6 ++++-
 include/linux/security.h  |  7 +++--
 kernel/kmod.c             | 29 ++++++++++++++++-----
 security/security.c       |  6 +++--
 security/selinux/hooks.c  |  3 ++-
 6 files changed, 97 insertions(+), 19 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index 40c89ad..ccd6a1c 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -33,16 +33,67 @@
 extern char modprobe_path[]; /* for sysctl */
 /* modprobe exit status on success, -ve on error.  Return value
  * usually useless though. */
-extern __printf(2, 3)
-int __request_module(bool wait, const char *name, ...);
-#define request_module(mod...) __request_module(true, mod)
-#define request_module_nowait(mod...) __request_module(false, mod)
+extern __printf(4, 5)
+int __request_module(bool wait, int required_cap,
+		     const char *prefix, const char *name, ...);
 #define try_then_request_module(x, mod...) \
-	((x) ?: (__request_module(true, mod), (x)))
+	((x) ?: (__request_module(true, -1, NULL, mod), (x)))
 #else
-static inline int request_module(const char *name, ...) { return -ENOSYS; }
-static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
+static inline __printf(4, 5)
+int __request_module(bool wait, int required_cap,
+		     const char *prefix, const char *name, ...)
+{ return -ENOSYS; }
 #define try_then_request_module(x, mod...) (x)
 #endif
 
+/**
+ * request_module  Try to load a kernel module
+ *
+ * Automatically loads the request module.
+ *
+ * @mod...: The module name
+ */
+#define request_module(mod...) __request_module(true, -1, NULL, mod)
+
+#define request_module_nowait(mod...) __request_module(false, -1, NULL, mod)
+
+/**
+ * request_module_cap  Load kernel module only if the required capability is set
+ *
+ * Automatically load a module if the required capability is set and it
+ * corresponds to the appropriate subsystem that is indicated by prefix.
+ *
+ * This allows to load aliased modules like 'netdev-%s' with CAP_NET_ADMIN.
+ *
+ * ex:
+ *	request_module_cap(CAP_NET_ADMIN, "netdev", "%s", mod);
+ *
+ * @required_cap: Required capability to load the module
+ * @prefix: The module prefix if any, otherwise NULL
+ * @fmt: printf style format string for the name of the module with its
+ *       arguments if any
+ *
+ * If '@required_cap' is positive, the security subsystem will check if
+ * '@prefix' is set and if caller has the required capability then the
+ * operation is allowed.
+ * The security subsystem can not make assumption about the boundaries
+ * of other subsystems, it is their responsability to make a call with
+ * the right capability and module alias.
+ *
+ * If '@required_cap' is positive and '@prefix' is NULL then we assume
+ * that the '@required_cap' is CAP_SYS_MODULE.
+ *
+ * If '@required_cap' is negative then there are no permission checks, this
+ * is the equivalent to request_module() function.
+ *
+ * This function trust callers to pass the right capability with the
+ * appropriate prefix.
+ *
+ * Note: the permission checks may still fail, even if the required
+ * capability is negative, this is due to module loading restrictions
+ * that are controlled by the enduser.
+ */
+#define request_module_cap(required_cap, prefix, fmt...) \
+	__request_module(true, required_cap, prefix, fmt)
+
 #endif /* __LINUX_KMOD_H__ */
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 7161d8e..d898bd0 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -571,6 +571,9 @@
  *	Ability to trigger the kernel to automatically upcall to userspace for
  *	userspace to load a kernel module with the given name.
  *	@kmod_name name of the module requested by the kernel
+ *	@required_cap If positive, the required capability to automatically load
+ *	the correspondig kernel module.
+ *	@prefix The prefix of the module if any. Can be NULL.
  *	Return 0 if successful.
  * @kernel_read_file:
  *	Read a file specified by userspace.
@@ -1543,7 +1546,8 @@ union security_list_options {
 	void (*cred_transfer)(struct cred *new, const struct cred *old);
 	int (*kernel_act_as)(struct cred *new, u32 secid);
 	int (*kernel_create_files_as)(struct cred *new, struct inode *inode);
-	int (*kernel_module_request)(char *kmod_name);
+	int (*kernel_module_request)(char *kmod_name, int required_cap,
+				     const char *prefix);
 	int (*kernel_read_file)(struct file *file, enum kernel_read_file_id id);
 	int (*kernel_post_read_file)(struct file *file, char *buf, loff_t size,
 				     enum kernel_read_file_id id);
diff --git a/include/linux/security.h b/include/linux/security.h
index 73f1ef6..41e700a 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -326,7 +326,8 @@ int security_prepare_creds(struct cred *new, const struct cred *old, gfp_t gfp);
 void security_transfer_creds(struct cred *new, const struct cred *old);
 int security_kernel_act_as(struct cred *new, u32 secid);
 int security_kernel_create_files_as(struct cred *new, struct inode *inode);
-int security_kernel_module_request(char *kmod_name);
+int security_kernel_module_request(char *kmod_name, int required_cap,
+				   const char *prefix);
 int security_kernel_read_file(struct file *file, enum kernel_read_file_id id);
 int security_kernel_post_read_file(struct file *file, char *buf, loff_t size,
 				   enum kernel_read_file_id id);
@@ -919,7 +920,9 @@ static inline int security_kernel_create_files_as(struct cred *cred,
 	return 0;
 }
 
-static inline int security_kernel_module_request(char *kmod_name)
+static inline int security_kernel_module_request(char *kmod_name,
+						 int required_cap,
+						 const char *prefix)
 {
 	return 0;
 }
diff --git a/kernel/kmod.c b/kernel/kmod.c
index bc6addd..679d401 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -109,6 +109,8 @@ static int call_modprobe(char *module_name, int wait)
 /**
  * __request_module - try to load a kernel module
  * @wait: wait (or not) for the operation to complete
+ * @required_cap: required capability to load the module
+ * @prefix: module prefix if any otherwise NULL
  * @fmt: printf style format string for the name of the module
  * @...: arguments as specified in the format string
  *
@@ -119,14 +121,20 @@ static int call_modprobe(char *module_name, int wait)
  * must check that the service they requested is now available not blindly
  * invoke it.
  *
- * If module auto-loading support is disabled then this function
- * becomes a no-operation.
+ * If "required_cap" is positive, The security subsystem will trust the caller
+ * that if it has "required_cap" then it may allow to load some modules that
+ * have a specific alias.
+ *
+ * If module auto-loading support is disabled by clearing the modprobe path
+ * then this function becomes a no-operation.
  */
-int __request_module(bool wait, const char *fmt, ...)
+int __request_module(bool wait, int required_cap,
+		     const char *prefix, const char *fmt, ...)
 {
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
 	int ret;
+	int len = 0;
 
 	/*
 	 * We don't allow synchronous module loading from async.  Module
@@ -139,13 +147,22 @@ int __request_module(bool wait, const char *fmt, ...)
 	if (!modprobe_path[0])
 		return 0;
 
+	/*
+	 * Lets attach the prefix to the module name
+	 */
+	if (prefix != NULL && *prefix != '\0') {
+		len += snprintf(module_name, MODULE_NAME_LEN, "%s-", prefix);
+		if (len >= MODULE_NAME_LEN)
+			return -ENAMETOOLONG;
+	}
+
 	va_start(args, fmt);
-	ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
+	ret = vsnprintf(module_name + len, MODULE_NAME_LEN - len, fmt, args);
 	va_end(args);
-	if (ret >= MODULE_NAME_LEN)
+	if (ret >= MODULE_NAME_LEN - len)
 		return -ENAMETOOLONG;
 
-	ret = security_kernel_module_request(module_name);
+	ret = security_kernel_module_request(module_name, required_cap, prefix);
 	if (ret)
 		return ret;
 
diff --git a/security/security.c b/security/security.c
index 1cd8526..91ecebd 100644
--- a/security/security.c
+++ b/security/security.c
@@ -1015,9 +1015,11 @@ int security_kernel_create_files_as(struct cred *new, struct inode *inode)
 	return call_int_hook(kernel_create_files_as, 0, new, inode);
 }
 
-int security_kernel_module_request(char *kmod_name)
+int security_kernel_module_request(char *kmod_name, int required_cap,
+				   const char *prefix)
 {
-	return call_int_hook(kernel_module_request, 0, kmod_name);
+	return call_int_hook(kernel_module_request, 0, kmod_name,
+			     required_cap, prefix);
 }
 
 int security_kernel_read_file(struct file *file, enum kernel_read_file_id id)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index d07299d..69f25da 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3889,7 +3889,8 @@ static int selinux_kernel_create_files_as(struct cred *new, struct inode *inode)
 	return ret;
 }
 
-static int selinux_kernel_module_request(char *kmod_name)
+static int selinux_kernel_module_request(char *kmod_name, int required_cap,
+					 const char *prefix)
 {
 	struct common_audit_data ad;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-27 17:18   ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: linux-security-module

This is a preparation patch to improve the module auto-load
infrastructure.

We need this patch to have more control on module auto-load operations.
The operation by default is allowed unless enduser or the calling code
requests that we need to perform futher permission checks.

With this change subsystems will be able to decide if module auto-load
feature first will have to do a capability check and load the module if
the permission check succeeds or deny the operation.

As an example "netdev-%s" modules, they are allowed to be loaded if
CAP_NET_ADMIN is set. Therefore, in order to not break this assumption,
and allow userspace to load "netdev-%s" modules with CAP_NET_ADMIN,
we have added:

        request_module_cap(required_cap, prefix, fmt...)

This new function will take:
'@required_cap': Required capability to load the module
'@prefix': The module prefix if any, otherwise NULL
'@fmt': printf style format string for the name of the module with its
        arguments if any

ex:
        request_module_cap(CAP_NET_ADMIN, "netdev", "%s", mod);

After a discussion with Rusty Russell [1], the suggestion was to pass
the capability from request_module() to security_kernel_module_request()
for 'netdev-%s' modules that need CAP_NET_ADMIN, and after review from
Kees Cook [2] and experimenting with the code, the patch now does the
following:

* Adds the request_module_cap() function.
* Updates the __request_module() to take the "required_cap" argument
        with the "prefix".

This patch also updates SELinux which is currently the only user of
security_kernel_module_request(), the security hook now accepts
'required_cap' and 'prefix' as arguments.

Based on patch by Rusty Russell and discussion with Kees Cook:
[1] https://lkml.org/lkml/2017/4/26/735
[2] https://lkml.org/lkml/2017/5/23/775

Cc: Serge Hallyn <serge@hallyn.com>
Cc: Andy Lutomirski <luto@kernel.org>
Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
---
 include/linux/kmod.h      | 65 ++++++++++++++++++++++++++++++++++++++++++-----
 include/linux/lsm_hooks.h |  6 ++++-
 include/linux/security.h  |  7 +++--
 kernel/kmod.c             | 29 ++++++++++++++++-----
 security/security.c       |  6 +++--
 security/selinux/hooks.c  |  3 ++-
 6 files changed, 97 insertions(+), 19 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index 40c89ad..ccd6a1c 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -33,16 +33,67 @@
 extern char modprobe_path[]; /* for sysctl */
 /* modprobe exit status on success, -ve on error.  Return value
  * usually useless though. */
-extern __printf(2, 3)
-int __request_module(bool wait, const char *name, ...);
-#define request_module(mod...) __request_module(true, mod)
-#define request_module_nowait(mod...) __request_module(false, mod)
+extern __printf(4, 5)
+int __request_module(bool wait, int required_cap,
+		     const char *prefix, const char *name, ...);
 #define try_then_request_module(x, mod...) \
-	((x) ?: (__request_module(true, mod), (x)))
+	((x) ?: (__request_module(true, -1, NULL, mod), (x)))
 #else
-static inline int request_module(const char *name, ...) { return -ENOSYS; }
-static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
+static inline __printf(4, 5)
+int __request_module(bool wait, int required_cap,
+		     const char *prefix, const char *name, ...)
+{ return -ENOSYS; }
 #define try_then_request_module(x, mod...) (x)
 #endif
 
+/**
+ * request_module  Try to load a kernel module
+ *
+ * Automatically loads the request module.
+ *
+ * @mod...: The module name
+ */
+#define request_module(mod...) __request_module(true, -1, NULL, mod)
+
+#define request_module_nowait(mod...) __request_module(false, -1, NULL, mod)
+
+/**
+ * request_module_cap  Load kernel module only if the required capability is set
+ *
+ * Automatically load a module if the required capability is set and it
+ * corresponds to the appropriate subsystem that is indicated by prefix.
+ *
+ * This allows to load aliased modules like 'netdev-%s' with CAP_NET_ADMIN.
+ *
+ * ex:
+ *	request_module_cap(CAP_NET_ADMIN, "netdev", "%s", mod);
+ *
+ * @required_cap: Required capability to load the module
+ * @prefix: The module prefix if any, otherwise NULL
+ * @fmt: printf style format string for the name of the module with its
+ *       arguments if any
+ *
+ * If '@required_cap' is positive, the security subsystem will check if
+ * '@prefix' is set and if caller has the required capability then the
+ * operation is allowed.
+ * The security subsystem can not make assumption about the boundaries
+ * of other subsystems, it is their responsability to make a call with
+ * the right capability and module alias.
+ *
+ * If '@required_cap' is positive and '@prefix' is NULL then we assume
+ * that the '@required_cap' is CAP_SYS_MODULE.
+ *
+ * If '@required_cap' is negative then there are no permission checks, this
+ * is the equivalent to request_module() function.
+ *
+ * This function trust callers to pass the right capability with the
+ * appropriate prefix.
+ *
+ * Note: the permission checks may still fail, even if the required
+ * capability is negative, this is due to module loading restrictions
+ * that are controlled by the enduser.
+ */
+#define request_module_cap(required_cap, prefix, fmt...) \
+	__request_module(true, required_cap, prefix, fmt)
+
 #endif /* __LINUX_KMOD_H__ */
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 7161d8e..d898bd0 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -571,6 +571,9 @@
  *	Ability to trigger the kernel to automatically upcall to userspace for
  *	userspace to load a kernel module with the given name.
  *	@kmod_name name of the module requested by the kernel
+ *	@required_cap If positive, the required capability to automatically load
+ *	the correspondig kernel module.
+ *	@prefix The prefix of the module if any. Can be NULL.
  *	Return 0 if successful.
  * @kernel_read_file:
  *	Read a file specified by userspace.
@@ -1543,7 +1546,8 @@ union security_list_options {
 	void (*cred_transfer)(struct cred *new, const struct cred *old);
 	int (*kernel_act_as)(struct cred *new, u32 secid);
 	int (*kernel_create_files_as)(struct cred *new, struct inode *inode);
-	int (*kernel_module_request)(char *kmod_name);
+	int (*kernel_module_request)(char *kmod_name, int required_cap,
+				     const char *prefix);
 	int (*kernel_read_file)(struct file *file, enum kernel_read_file_id id);
 	int (*kernel_post_read_file)(struct file *file, char *buf, loff_t size,
 				     enum kernel_read_file_id id);
diff --git a/include/linux/security.h b/include/linux/security.h
index 73f1ef6..41e700a 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -326,7 +326,8 @@ int security_prepare_creds(struct cred *new, const struct cred *old, gfp_t gfp);
 void security_transfer_creds(struct cred *new, const struct cred *old);
 int security_kernel_act_as(struct cred *new, u32 secid);
 int security_kernel_create_files_as(struct cred *new, struct inode *inode);
-int security_kernel_module_request(char *kmod_name);
+int security_kernel_module_request(char *kmod_name, int required_cap,
+				   const char *prefix);
 int security_kernel_read_file(struct file *file, enum kernel_read_file_id id);
 int security_kernel_post_read_file(struct file *file, char *buf, loff_t size,
 				   enum kernel_read_file_id id);
@@ -919,7 +920,9 @@ static inline int security_kernel_create_files_as(struct cred *cred,
 	return 0;
 }
 
-static inline int security_kernel_module_request(char *kmod_name)
+static inline int security_kernel_module_request(char *kmod_name,
+						 int required_cap,
+						 const char *prefix)
 {
 	return 0;
 }
diff --git a/kernel/kmod.c b/kernel/kmod.c
index bc6addd..679d401 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -109,6 +109,8 @@ static int call_modprobe(char *module_name, int wait)
 /**
  * __request_module - try to load a kernel module
  * @wait: wait (or not) for the operation to complete
+ * @required_cap: required capability to load the module
+ * @prefix: module prefix if any otherwise NULL
  * @fmt: printf style format string for the name of the module
  * @...: arguments as specified in the format string
  *
@@ -119,14 +121,20 @@ static int call_modprobe(char *module_name, int wait)
  * must check that the service they requested is now available not blindly
  * invoke it.
  *
- * If module auto-loading support is disabled then this function
- * becomes a no-operation.
+ * If "required_cap" is positive, The security subsystem will trust the caller
+ * that if it has "required_cap" then it may allow to load some modules that
+ * have a specific alias.
+ *
+ * If module auto-loading support is disabled by clearing the modprobe path
+ * then this function becomes a no-operation.
  */
-int __request_module(bool wait, const char *fmt, ...)
+int __request_module(bool wait, int required_cap,
+		     const char *prefix, const char *fmt, ...)
 {
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
 	int ret;
+	int len = 0;
 
 	/*
 	 * We don't allow synchronous module loading from async.  Module
@@ -139,13 +147,22 @@ int __request_module(bool wait, const char *fmt, ...)
 	if (!modprobe_path[0])
 		return 0;
 
+	/*
+	 * Lets attach the prefix to the module name
+	 */
+	if (prefix != NULL && *prefix != '\0') {
+		len += snprintf(module_name, MODULE_NAME_LEN, "%s-", prefix);
+		if (len >= MODULE_NAME_LEN)
+			return -ENAMETOOLONG;
+	}
+
 	va_start(args, fmt);
-	ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
+	ret = vsnprintf(module_name + len, MODULE_NAME_LEN - len, fmt, args);
 	va_end(args);
-	if (ret >= MODULE_NAME_LEN)
+	if (ret >= MODULE_NAME_LEN - len)
 		return -ENAMETOOLONG;
 
-	ret = security_kernel_module_request(module_name);
+	ret = security_kernel_module_request(module_name, required_cap, prefix);
 	if (ret)
 		return ret;
 
diff --git a/security/security.c b/security/security.c
index 1cd8526..91ecebd 100644
--- a/security/security.c
+++ b/security/security.c
@@ -1015,9 +1015,11 @@ int security_kernel_create_files_as(struct cred *new, struct inode *inode)
 	return call_int_hook(kernel_create_files_as, 0, new, inode);
 }
 
-int security_kernel_module_request(char *kmod_name)
+int security_kernel_module_request(char *kmod_name, int required_cap,
+				   const char *prefix)
 {
-	return call_int_hook(kernel_module_request, 0, kmod_name);
+	return call_int_hook(kernel_module_request, 0, kmod_name,
+			     required_cap, prefix);
 }
 
 int security_kernel_read_file(struct file *file, enum kernel_read_file_id id)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index d07299d..69f25da 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3889,7 +3889,8 @@ static int selinux_kernel_create_files_as(struct cred *new, struct inode *inode)
 	return ret;
 }
 
-static int selinux_kernel_module_request(char *kmod_name)
+static int selinux_kernel_module_request(char *kmod_name, int required_cap,
+					 const char *prefix)
 {
 	struct common_audit_data ad;
 
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 266+ messages in thread

* [kernel-hardening] [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-27 17:18   ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening
  Cc: Jonathan Corbet, Ingo Molnar, David S. Miller, netdev,
	Peter Zijlstra, Linus Torvalds, Djalal Harouni

This is a preparation patch to improve the module auto-load
infrastructure.

We need this patch to have more control on module auto-load operations.
The operation by default is allowed unless enduser or the calling code
requests that we need to perform futher permission checks.

With this change subsystems will be able to decide if module auto-load
feature first will have to do a capability check and load the module if
the permission check succeeds or deny the operation.

As an example "netdev-%s" modules, they are allowed to be loaded if
CAP_NET_ADMIN is set. Therefore, in order to not break this assumption,
and allow userspace to load "netdev-%s" modules with CAP_NET_ADMIN,
we have added:

        request_module_cap(required_cap, prefix, fmt...)

This new function will take:
'@required_cap': Required capability to load the module
'@prefix': The module prefix if any, otherwise NULL
'@fmt': printf style format string for the name of the module with its
        arguments if any

ex:
        request_module_cap(CAP_NET_ADMIN, "netdev", "%s", mod);

After a discussion with Rusty Russell [1], the suggestion was to pass
the capability from request_module() to security_kernel_module_request()
for 'netdev-%s' modules that need CAP_NET_ADMIN, and after review from
Kees Cook [2] and experimenting with the code, the patch now does the
following:

* Adds the request_module_cap() function.
* Updates the __request_module() to take the "required_cap" argument
        with the "prefix".

This patch also updates SELinux which is currently the only user of
security_kernel_module_request(), the security hook now accepts
'required_cap' and 'prefix' as arguments.

Based on patch by Rusty Russell and discussion with Kees Cook:
[1] https://lkml.org/lkml/2017/4/26/735
[2] https://lkml.org/lkml/2017/5/23/775

Cc: Serge Hallyn <serge@hallyn.com>
Cc: Andy Lutomirski <luto@kernel.org>
Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
---
 include/linux/kmod.h      | 65 ++++++++++++++++++++++++++++++++++++++++++-----
 include/linux/lsm_hooks.h |  6 ++++-
 include/linux/security.h  |  7 +++--
 kernel/kmod.c             | 29 ++++++++++++++++-----
 security/security.c       |  6 +++--
 security/selinux/hooks.c  |  3 ++-
 6 files changed, 97 insertions(+), 19 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index 40c89ad..ccd6a1c 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -33,16 +33,67 @@
 extern char modprobe_path[]; /* for sysctl */
 /* modprobe exit status on success, -ve on error.  Return value
  * usually useless though. */
-extern __printf(2, 3)
-int __request_module(bool wait, const char *name, ...);
-#define request_module(mod...) __request_module(true, mod)
-#define request_module_nowait(mod...) __request_module(false, mod)
+extern __printf(4, 5)
+int __request_module(bool wait, int required_cap,
+		     const char *prefix, const char *name, ...);
 #define try_then_request_module(x, mod...) \
-	((x) ?: (__request_module(true, mod), (x)))
+	((x) ?: (__request_module(true, -1, NULL, mod), (x)))
 #else
-static inline int request_module(const char *name, ...) { return -ENOSYS; }
-static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
+static inline __printf(4, 5)
+int __request_module(bool wait, int required_cap,
+		     const char *prefix, const char *name, ...)
+{ return -ENOSYS; }
 #define try_then_request_module(x, mod...) (x)
 #endif
 
+/**
+ * request_module  Try to load a kernel module
+ *
+ * Automatically loads the request module.
+ *
+ * @mod...: The module name
+ */
+#define request_module(mod...) __request_module(true, -1, NULL, mod)
+
+#define request_module_nowait(mod...) __request_module(false, -1, NULL, mod)
+
+/**
+ * request_module_cap  Load kernel module only if the required capability is set
+ *
+ * Automatically load a module if the required capability is set and it
+ * corresponds to the appropriate subsystem that is indicated by prefix.
+ *
+ * This allows to load aliased modules like 'netdev-%s' with CAP_NET_ADMIN.
+ *
+ * ex:
+ *	request_module_cap(CAP_NET_ADMIN, "netdev", "%s", mod);
+ *
+ * @required_cap: Required capability to load the module
+ * @prefix: The module prefix if any, otherwise NULL
+ * @fmt: printf style format string for the name of the module with its
+ *       arguments if any
+ *
+ * If '@required_cap' is positive, the security subsystem will check if
+ * '@prefix' is set and if caller has the required capability then the
+ * operation is allowed.
+ * The security subsystem can not make assumption about the boundaries
+ * of other subsystems, it is their responsability to make a call with
+ * the right capability and module alias.
+ *
+ * If '@required_cap' is positive and '@prefix' is NULL then we assume
+ * that the '@required_cap' is CAP_SYS_MODULE.
+ *
+ * If '@required_cap' is negative then there are no permission checks, this
+ * is the equivalent to request_module() function.
+ *
+ * This function trust callers to pass the right capability with the
+ * appropriate prefix.
+ *
+ * Note: the permission checks may still fail, even if the required
+ * capability is negative, this is due to module loading restrictions
+ * that are controlled by the enduser.
+ */
+#define request_module_cap(required_cap, prefix, fmt...) \
+	__request_module(true, required_cap, prefix, fmt)
+
 #endif /* __LINUX_KMOD_H__ */
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 7161d8e..d898bd0 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -571,6 +571,9 @@
  *	Ability to trigger the kernel to automatically upcall to userspace for
  *	userspace to load a kernel module with the given name.
  *	@kmod_name name of the module requested by the kernel
+ *	@required_cap If positive, the required capability to automatically load
+ *	the correspondig kernel module.
+ *	@prefix The prefix of the module if any. Can be NULL.
  *	Return 0 if successful.
  * @kernel_read_file:
  *	Read a file specified by userspace.
@@ -1543,7 +1546,8 @@ union security_list_options {
 	void (*cred_transfer)(struct cred *new, const struct cred *old);
 	int (*kernel_act_as)(struct cred *new, u32 secid);
 	int (*kernel_create_files_as)(struct cred *new, struct inode *inode);
-	int (*kernel_module_request)(char *kmod_name);
+	int (*kernel_module_request)(char *kmod_name, int required_cap,
+				     const char *prefix);
 	int (*kernel_read_file)(struct file *file, enum kernel_read_file_id id);
 	int (*kernel_post_read_file)(struct file *file, char *buf, loff_t size,
 				     enum kernel_read_file_id id);
diff --git a/include/linux/security.h b/include/linux/security.h
index 73f1ef6..41e700a 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -326,7 +326,8 @@ int security_prepare_creds(struct cred *new, const struct cred *old, gfp_t gfp);
 void security_transfer_creds(struct cred *new, const struct cred *old);
 int security_kernel_act_as(struct cred *new, u32 secid);
 int security_kernel_create_files_as(struct cred *new, struct inode *inode);
-int security_kernel_module_request(char *kmod_name);
+int security_kernel_module_request(char *kmod_name, int required_cap,
+				   const char *prefix);
 int security_kernel_read_file(struct file *file, enum kernel_read_file_id id);
 int security_kernel_post_read_file(struct file *file, char *buf, loff_t size,
 				   enum kernel_read_file_id id);
@@ -919,7 +920,9 @@ static inline int security_kernel_create_files_as(struct cred *cred,
 	return 0;
 }
 
-static inline int security_kernel_module_request(char *kmod_name)
+static inline int security_kernel_module_request(char *kmod_name,
+						 int required_cap,
+						 const char *prefix)
 {
 	return 0;
 }
diff --git a/kernel/kmod.c b/kernel/kmod.c
index bc6addd..679d401 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -109,6 +109,8 @@ static int call_modprobe(char *module_name, int wait)
 /**
  * __request_module - try to load a kernel module
  * @wait: wait (or not) for the operation to complete
+ * @required_cap: required capability to load the module
+ * @prefix: module prefix if any otherwise NULL
  * @fmt: printf style format string for the name of the module
  * @...: arguments as specified in the format string
  *
@@ -119,14 +121,20 @@ static int call_modprobe(char *module_name, int wait)
  * must check that the service they requested is now available not blindly
  * invoke it.
  *
- * If module auto-loading support is disabled then this function
- * becomes a no-operation.
+ * If "required_cap" is positive, The security subsystem will trust the caller
+ * that if it has "required_cap" then it may allow to load some modules that
+ * have a specific alias.
+ *
+ * If module auto-loading support is disabled by clearing the modprobe path
+ * then this function becomes a no-operation.
  */
-int __request_module(bool wait, const char *fmt, ...)
+int __request_module(bool wait, int required_cap,
+		     const char *prefix, const char *fmt, ...)
 {
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
 	int ret;
+	int len = 0;
 
 	/*
 	 * We don't allow synchronous module loading from async.  Module
@@ -139,13 +147,22 @@ int __request_module(bool wait, const char *fmt, ...)
 	if (!modprobe_path[0])
 		return 0;
 
+	/*
+	 * Lets attach the prefix to the module name
+	 */
+	if (prefix != NULL && *prefix != '\0') {
+		len += snprintf(module_name, MODULE_NAME_LEN, "%s-", prefix);
+		if (len >= MODULE_NAME_LEN)
+			return -ENAMETOOLONG;
+	}
+
 	va_start(args, fmt);
-	ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
+	ret = vsnprintf(module_name + len, MODULE_NAME_LEN - len, fmt, args);
 	va_end(args);
-	if (ret >= MODULE_NAME_LEN)
+	if (ret >= MODULE_NAME_LEN - len)
 		return -ENAMETOOLONG;
 
-	ret = security_kernel_module_request(module_name);
+	ret = security_kernel_module_request(module_name, required_cap, prefix);
 	if (ret)
 		return ret;
 
diff --git a/security/security.c b/security/security.c
index 1cd8526..91ecebd 100644
--- a/security/security.c
+++ b/security/security.c
@@ -1015,9 +1015,11 @@ int security_kernel_create_files_as(struct cred *new, struct inode *inode)
 	return call_int_hook(kernel_create_files_as, 0, new, inode);
 }
 
-int security_kernel_module_request(char *kmod_name)
+int security_kernel_module_request(char *kmod_name, int required_cap,
+				   const char *prefix)
 {
-	return call_int_hook(kernel_module_request, 0, kmod_name);
+	return call_int_hook(kernel_module_request, 0, kmod_name,
+			     required_cap, prefix);
 }
 
 int security_kernel_read_file(struct file *file, enum kernel_read_file_id id)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index d07299d..69f25da 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3889,7 +3889,8 @@ static int selinux_kernel_create_files_as(struct cred *new, struct inode *inode)
 	return ret;
 }
 
-static int selinux_kernel_module_request(char *kmod_name)
+static int selinux_kernel_module_request(char *kmod_name, int required_cap,
+					 const char *prefix)
 {
 	struct common_audit_data ad;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 266+ messages in thread

* [PATCH v5 next 2/5] modules:capabilities: add cap_kernel_module_request() permission check
  2017-11-27 17:18 ` Djalal Harouni
  (?)
@ 2017-11-27 17:18   ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening
  Cc: Jonathan Corbet, Ingo Molnar, David S. Miller, netdev,
	Peter Zijlstra, Linus Torvalds, Djalal Harouni

This is a preparation patch to improve for the module auto-load
infrastrucutre.

With this change, subsystems that want to autoload modules and implement
onsite capability checks, can defer the checks to the capability
subsystem by passing the required capabilities with the appropriate
modules alias. The capability subsystem will trust callers about
the passed values and perform a capability check to either allow module
auto-loading or deny it.

This patch changes:
* Adds cap_kernel_module_request() capability hook.
* Adds an empty may_autoload_module() that will be updated in the next
  patch.

Cc: James Morris <james.l.morris@oracle.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
---
 include/linux/module.h   | 10 ++++++++++
 include/linux/security.h |  4 +++-
 kernel/module.c          | 23 +++++++++++++++++++++++
 security/commoncap.c     | 26 ++++++++++++++++++++++++++
 4 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/include/linux/module.h b/include/linux/module.h
index c69b49a..5cbb239 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -497,6 +497,10 @@ bool __is_module_percpu_address(unsigned long addr, unsigned long *can_addr);
 bool is_module_percpu_address(unsigned long addr);
 bool is_module_text_address(unsigned long addr);
 
+/* Determine whether a module auto-load operation is permitted. */
+int may_autoload_module(char *kmod_name, int required_cap,
+			const char *kmod_prefix);
+
 static inline bool within_module_core(unsigned long addr,
 				      const struct module *mod)
 {
@@ -643,6 +647,12 @@ bool is_module_sig_enforced(void);
 
 #else /* !CONFIG_MODULES... */
 
+static inline int may_autoload_module(char *kmod_name, int required_cap,
+				      const char *kmod_prefix)
+{
+	return -ENOSYS;
+}
+
 static inline struct module *__module_address(unsigned long addr)
 {
 	return NULL;
diff --git a/include/linux/security.h b/include/linux/security.h
index 41e700a..9bb53b5 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -102,6 +102,8 @@ extern int cap_task_setscheduler(struct task_struct *p);
 extern int cap_task_setioprio(struct task_struct *p, int ioprio);
 extern int cap_task_setnice(struct task_struct *p, int nice);
 extern int cap_vm_enough_memory(struct mm_struct *mm, long pages);
+extern int cap_kernel_module_request(char *kmod_name, int required_cap,
+				     const char *kmod_prefix);
 
 struct msghdr;
 struct sk_buff;
@@ -924,7 +926,7 @@ static inline int security_kernel_module_request(char *kmod_name,
 						 int required_cap,
 						 const char *prefix)
 {
-	return 0;
+	return cap_kernel_module_request(kmod_name, required_cap, prefix);
 }
 
 static inline int security_kernel_read_file(struct file *file,
diff --git a/kernel/module.c b/kernel/module.c
index f0411a2..3380d39 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -4340,6 +4340,29 @@ struct module *__module_text_address(unsigned long addr)
 }
 EXPORT_SYMBOL_GPL(__module_text_address);
 
+/**
+ * may_autoload_module - Determine whether a module auto-load operation
+ * is permitted
+ * @kmod_name: The module name
+ * @required_cap: if positive, may allow to auto-load the module if this
+ *	capability is set
+ * @kmod_prefix: The module prefix if any, otherwise NULL
+ *
+ * Determine whether a module auto-load operation is allowed or not.
+ *
+ * This allows to have more control on automatic module loading, and align it
+ * with explicit load/unload module operations. The kernel contains several
+ * modules, some of them are not updated often and may contain bugs and
+ * vulnerabilities.
+ *
+ * Returns 0 if the module request is allowed or -EPERM if not.
+ */
+int may_autoload_module(char *kmod_name, int required_cap,
+			const char *kmod_prefix)
+{
+	return 0;
+}
+
 /* Don't grab lock, we're oopsing. */
 void print_modules(void)
 {
diff --git a/security/commoncap.c b/security/commoncap.c
index 4f8e093..236e573 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -1340,6 +1340,31 @@ int cap_mmap_file(struct file *file, unsigned long reqprot,
 	return 0;
 }
 
+/**
+ * cap_kernel_module_request - Determine whether a module auto-load is permitted
+ * @kmod_name: The module name
+ * @required_cap: if positive, may allow to auto-load the module if this
+ *	capability is set
+ * @kmod_prefix: the module prefix if any, otherwise NULL
+ *
+ * Determine whether a module should be automatically loaded.
+ * Returns 0 if the module request should be allowed, -EPERM if not.
+ */
+int cap_kernel_module_request(char *kmod_name, int required_cap,
+			      const char *kmod_prefix)
+{
+	int ret;
+	char comm[sizeof(current->comm)];
+
+	ret = may_autoload_module(kmod_name, required_cap, kmod_prefix);
+	if (ret < 0)
+		pr_notice_ratelimited(
+			"module: automatic module loading of %.64s by \"%s\"[%d] was denied\n",
+			kmod_name, get_task_comm(comm, current), current->pid);
+
+	return ret;
+}
+
 #ifdef CONFIG_SECURITY
 
 struct security_hook_list capability_hooks[] __lsm_ro_after_init = {
@@ -1361,6 +1386,7 @@ struct security_hook_list capability_hooks[] __lsm_ro_after_init = {
 	LSM_HOOK_INIT(task_setioprio, cap_task_setioprio),
 	LSM_HOOK_INIT(task_setnice, cap_task_setnice),
 	LSM_HOOK_INIT(vm_enough_memory, cap_vm_enough_memory),
+	LSM_HOOK_INIT(kernel_module_request, cap_kernel_module_request),
 };
 
 void __init capability_add_hooks(void)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 266+ messages in thread

* [PATCH v5 next 2/5] modules:capabilities: add cap_kernel_module_request() permission check
@ 2017-11-27 17:18   ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: linux-security-module

This is a preparation patch to improve for the module auto-load
infrastrucutre.

With this change, subsystems that want to autoload modules and implement
onsite capability checks, can defer the checks to the capability
subsystem by passing the required capabilities with the appropriate
modules alias. The capability subsystem will trust callers about
the passed values and perform a capability check to either allow module
auto-loading or deny it.

This patch changes:
* Adds cap_kernel_module_request() capability hook.
* Adds an empty may_autoload_module() that will be updated in the next
  patch.

Cc: James Morris <james.l.morris@oracle.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
---
 include/linux/module.h   | 10 ++++++++++
 include/linux/security.h |  4 +++-
 kernel/module.c          | 23 +++++++++++++++++++++++
 security/commoncap.c     | 26 ++++++++++++++++++++++++++
 4 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/include/linux/module.h b/include/linux/module.h
index c69b49a..5cbb239 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -497,6 +497,10 @@ bool __is_module_percpu_address(unsigned long addr, unsigned long *can_addr);
 bool is_module_percpu_address(unsigned long addr);
 bool is_module_text_address(unsigned long addr);
 
+/* Determine whether a module auto-load operation is permitted. */
+int may_autoload_module(char *kmod_name, int required_cap,
+			const char *kmod_prefix);
+
 static inline bool within_module_core(unsigned long addr,
 				      const struct module *mod)
 {
@@ -643,6 +647,12 @@ bool is_module_sig_enforced(void);
 
 #else /* !CONFIG_MODULES... */
 
+static inline int may_autoload_module(char *kmod_name, int required_cap,
+				      const char *kmod_prefix)
+{
+	return -ENOSYS;
+}
+
 static inline struct module *__module_address(unsigned long addr)
 {
 	return NULL;
diff --git a/include/linux/security.h b/include/linux/security.h
index 41e700a..9bb53b5 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -102,6 +102,8 @@ extern int cap_task_setscheduler(struct task_struct *p);
 extern int cap_task_setioprio(struct task_struct *p, int ioprio);
 extern int cap_task_setnice(struct task_struct *p, int nice);
 extern int cap_vm_enough_memory(struct mm_struct *mm, long pages);
+extern int cap_kernel_module_request(char *kmod_name, int required_cap,
+				     const char *kmod_prefix);
 
 struct msghdr;
 struct sk_buff;
@@ -924,7 +926,7 @@ static inline int security_kernel_module_request(char *kmod_name,
 						 int required_cap,
 						 const char *prefix)
 {
-	return 0;
+	return cap_kernel_module_request(kmod_name, required_cap, prefix);
 }
 
 static inline int security_kernel_read_file(struct file *file,
diff --git a/kernel/module.c b/kernel/module.c
index f0411a2..3380d39 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -4340,6 +4340,29 @@ struct module *__module_text_address(unsigned long addr)
 }
 EXPORT_SYMBOL_GPL(__module_text_address);
 
+/**
+ * may_autoload_module - Determine whether a module auto-load operation
+ * is permitted
+ * @kmod_name: The module name
+ * @required_cap: if positive, may allow to auto-load the module if this
+ *	capability is set
+ * @kmod_prefix: The module prefix if any, otherwise NULL
+ *
+ * Determine whether a module auto-load operation is allowed or not.
+ *
+ * This allows to have more control on automatic module loading, and align it
+ * with explicit load/unload module operations. The kernel contains several
+ * modules, some of them are not updated often and may contain bugs and
+ * vulnerabilities.
+ *
+ * Returns 0 if the module request is allowed or -EPERM if not.
+ */
+int may_autoload_module(char *kmod_name, int required_cap,
+			const char *kmod_prefix)
+{
+	return 0;
+}
+
 /* Don't grab lock, we're oopsing. */
 void print_modules(void)
 {
diff --git a/security/commoncap.c b/security/commoncap.c
index 4f8e093..236e573 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -1340,6 +1340,31 @@ int cap_mmap_file(struct file *file, unsigned long reqprot,
 	return 0;
 }
 
+/**
+ * cap_kernel_module_request - Determine whether a module auto-load is permitted
+ * @kmod_name: The module name
+ * @required_cap: if positive, may allow to auto-load the module if this
+ *	capability is set
+ * @kmod_prefix: the module prefix if any, otherwise NULL
+ *
+ * Determine whether a module should be automatically loaded.
+ * Returns 0 if the module request should be allowed, -EPERM if not.
+ */
+int cap_kernel_module_request(char *kmod_name, int required_cap,
+			      const char *kmod_prefix)
+{
+	int ret;
+	char comm[sizeof(current->comm)];
+
+	ret = may_autoload_module(kmod_name, required_cap, kmod_prefix);
+	if (ret < 0)
+		pr_notice_ratelimited(
+			"module: automatic module loading of %.64s by \"%s\"[%d] was denied\n",
+			kmod_name, get_task_comm(comm, current), current->pid);
+
+	return ret;
+}
+
 #ifdef CONFIG_SECURITY
 
 struct security_hook_list capability_hooks[] __lsm_ro_after_init = {
@@ -1361,6 +1386,7 @@ struct security_hook_list capability_hooks[] __lsm_ro_after_init = {
 	LSM_HOOK_INIT(task_setioprio, cap_task_setioprio),
 	LSM_HOOK_INIT(task_setnice, cap_task_setnice),
 	LSM_HOOK_INIT(vm_enough_memory, cap_vm_enough_memory),
+	LSM_HOOK_INIT(kernel_module_request, cap_kernel_module_request),
 };
 
 void __init capability_add_hooks(void)
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 266+ messages in thread

* [kernel-hardening] [PATCH v5 next 2/5] modules:capabilities: add cap_kernel_module_request() permission check
@ 2017-11-27 17:18   ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening
  Cc: Jonathan Corbet, Ingo Molnar, David S. Miller, netdev,
	Peter Zijlstra, Linus Torvalds, Djalal Harouni

This is a preparation patch to improve for the module auto-load
infrastrucutre.

With this change, subsystems that want to autoload modules and implement
onsite capability checks, can defer the checks to the capability
subsystem by passing the required capabilities with the appropriate
modules alias. The capability subsystem will trust callers about
the passed values and perform a capability check to either allow module
auto-loading or deny it.

This patch changes:
* Adds cap_kernel_module_request() capability hook.
* Adds an empty may_autoload_module() that will be updated in the next
  patch.

Cc: James Morris <james.l.morris@oracle.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
---
 include/linux/module.h   | 10 ++++++++++
 include/linux/security.h |  4 +++-
 kernel/module.c          | 23 +++++++++++++++++++++++
 security/commoncap.c     | 26 ++++++++++++++++++++++++++
 4 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/include/linux/module.h b/include/linux/module.h
index c69b49a..5cbb239 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -497,6 +497,10 @@ bool __is_module_percpu_address(unsigned long addr, unsigned long *can_addr);
 bool is_module_percpu_address(unsigned long addr);
 bool is_module_text_address(unsigned long addr);
 
+/* Determine whether a module auto-load operation is permitted. */
+int may_autoload_module(char *kmod_name, int required_cap,
+			const char *kmod_prefix);
+
 static inline bool within_module_core(unsigned long addr,
 				      const struct module *mod)
 {
@@ -643,6 +647,12 @@ bool is_module_sig_enforced(void);
 
 #else /* !CONFIG_MODULES... */
 
+static inline int may_autoload_module(char *kmod_name, int required_cap,
+				      const char *kmod_prefix)
+{
+	return -ENOSYS;
+}
+
 static inline struct module *__module_address(unsigned long addr)
 {
 	return NULL;
diff --git a/include/linux/security.h b/include/linux/security.h
index 41e700a..9bb53b5 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -102,6 +102,8 @@ extern int cap_task_setscheduler(struct task_struct *p);
 extern int cap_task_setioprio(struct task_struct *p, int ioprio);
 extern int cap_task_setnice(struct task_struct *p, int nice);
 extern int cap_vm_enough_memory(struct mm_struct *mm, long pages);
+extern int cap_kernel_module_request(char *kmod_name, int required_cap,
+				     const char *kmod_prefix);
 
 struct msghdr;
 struct sk_buff;
@@ -924,7 +926,7 @@ static inline int security_kernel_module_request(char *kmod_name,
 						 int required_cap,
 						 const char *prefix)
 {
-	return 0;
+	return cap_kernel_module_request(kmod_name, required_cap, prefix);
 }
 
 static inline int security_kernel_read_file(struct file *file,
diff --git a/kernel/module.c b/kernel/module.c
index f0411a2..3380d39 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -4340,6 +4340,29 @@ struct module *__module_text_address(unsigned long addr)
 }
 EXPORT_SYMBOL_GPL(__module_text_address);
 
+/**
+ * may_autoload_module - Determine whether a module auto-load operation
+ * is permitted
+ * @kmod_name: The module name
+ * @required_cap: if positive, may allow to auto-load the module if this
+ *	capability is set
+ * @kmod_prefix: The module prefix if any, otherwise NULL
+ *
+ * Determine whether a module auto-load operation is allowed or not.
+ *
+ * This allows to have more control on automatic module loading, and align it
+ * with explicit load/unload module operations. The kernel contains several
+ * modules, some of them are not updated often and may contain bugs and
+ * vulnerabilities.
+ *
+ * Returns 0 if the module request is allowed or -EPERM if not.
+ */
+int may_autoload_module(char *kmod_name, int required_cap,
+			const char *kmod_prefix)
+{
+	return 0;
+}
+
 /* Don't grab lock, we're oopsing. */
 void print_modules(void)
 {
diff --git a/security/commoncap.c b/security/commoncap.c
index 4f8e093..236e573 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -1340,6 +1340,31 @@ int cap_mmap_file(struct file *file, unsigned long reqprot,
 	return 0;
 }
 
+/**
+ * cap_kernel_module_request - Determine whether a module auto-load is permitted
+ * @kmod_name: The module name
+ * @required_cap: if positive, may allow to auto-load the module if this
+ *	capability is set
+ * @kmod_prefix: the module prefix if any, otherwise NULL
+ *
+ * Determine whether a module should be automatically loaded.
+ * Returns 0 if the module request should be allowed, -EPERM if not.
+ */
+int cap_kernel_module_request(char *kmod_name, int required_cap,
+			      const char *kmod_prefix)
+{
+	int ret;
+	char comm[sizeof(current->comm)];
+
+	ret = may_autoload_module(kmod_name, required_cap, kmod_prefix);
+	if (ret < 0)
+		pr_notice_ratelimited(
+			"module: automatic module loading of %.64s by \"%s\"[%d] was denied\n",
+			kmod_name, get_task_comm(comm, current), current->pid);
+
+	return ret;
+}
+
 #ifdef CONFIG_SECURITY
 
 struct security_hook_list capability_hooks[] __lsm_ro_after_init = {
@@ -1361,6 +1386,7 @@ struct security_hook_list capability_hooks[] __lsm_ro_after_init = {
 	LSM_HOOK_INIT(task_setioprio, cap_task_setioprio),
 	LSM_HOOK_INIT(task_setnice, cap_task_setnice),
 	LSM_HOOK_INIT(vm_enough_memory, cap_vm_enough_memory),
+	LSM_HOOK_INIT(kernel_module_request, cap_kernel_module_request),
 };
 
 void __init capability_add_hooks(void)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 266+ messages in thread

* [PATCH v5 next 3/5] modules:capabilities: automatic module loading restriction
  2017-11-27 17:18 ` Djalal Harouni
  (?)
@ 2017-11-27 17:18   ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening
  Cc: Jonathan Corbet, Ingo Molnar, David S. Miller, netdev,
	Peter Zijlstra, Linus Torvalds, Djalal Harouni

Currently, an explicit call to load or unload kernel modules require
CAP_SYS_MODULE capability. However unprivileged users have always been
able to load some modules using the implicit auto-load operation. An
automatic module loading happens when programs request a kernel feature
from a module that is not loaded. In order to satisfy userspace, the
kernel then automatically load all these required modules.

Historically, the kernel was always able to automatically load modules
if they are not blacklisted. This is one of the most important and
transparent operations of Linux, it allows to provide numerous other
features as they are needed which is crucial for a better user experience.
However, as Linux is popular now and used for different appliances some
of these may need to control such operations. For such systems, recent
needs showed that in some cases allowing to control automatic module
loading is as important as the operation itself. Restricting unprivileged
programs or attackers that abuse this feature to load unused modules or
modules that contain bugs is a significant security measure.

This allows administrators or some special programs to have the
appropriate time to update and deny module autoloading in advance, then
blacklist the corresponding ones. Not doing so may affect the global state
of the machine, especially containers where some apps are moved from one
context to another and not having such mechanisms may allow to expose
and exploit the vulnerable parts to escape the container sandbox.

Embedded or IoT devices also started to ship as containers using generic
distros, some vendors do not have the appropriate time to make their own
OS, hence, using base images is getting popular. These setups may include
unnecessary modules that the final applications will not need. Untrusted
access may abuse the module auto-load feature to expose vulnerabilities.

As every code contains bugs or vulnerabilties, the following
vulnerabilities that affected some features that are often compiled as
modules could have been completely blocked, by restricting autoloading
modules if the system does not need them.

Past months:
* DCCP use after free CVE-2017-6074 [1] [2]
  Unprivileged to local root.

* XFRM framework CVE-2017-7184 [3]
  As advertised it seems it was used to break Ubuntu on a security
  contest.

* n_hldc CVE-2017-2636 [4] [5]
  Local privilege escalation.

* L2TPv3 CVE-2016-10200

The list is longer.

To improve the current status, this patch introduces "modules_autoload_mode"
kernel sysctl flag. The flag controls modules auto-load feature and
complements "modules_disabled" which apply to all modules operations.
This new flag allows to control only automatic module loading and if it is
allowed or not, aligning in the process the implicit operation with the
explicit one where both now are covered by capabilities checks.

The three modes that "modules_autoload_mode" support allow to provide
restrictions on automatic module loading without breaking user
experience.

The sysctl flag is available at "/proc/sys/kernel/modules_autoload_mode"

When modules_autoload_mode is set to (0), the default, there are no
restrictions.

When modules_autoload_mode is set to (1), processes must have
CAP_SYS_MODULE to be able to trigger a module auto-load operation,
or CAP_NET_ADMIN for modules with a 'netdev-%s' alias, or other
capabilities for specific aliased modules.

When modules_autoload_mode is set to (2), automatic module loading
is disabled for all.

Notes on relation between "modules_disabled=0" and
"modules_autoload_mode=2":
1) Once "modules_disabled=1" set, it needs a reboot to undo the
setting.

2) Restricting automatic module loading does not interfere with
explicit module load or unload operations.

3) New features provided by modules can be made available without
rebooting the system.

4) A bad version of a module can be unloaded and replaced with a
better one without rebooting the system.

The idea of module auto-load restriction was inspired from grsecurity
'GRKERNSEC_MODHARDEN' config option. Upstream Linux implementation is
more focused on the run-time behavior with a three mode switch.

Testing
-------

Example 1)

Before:
$ lsmod | grep ipip -
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
$ lsmod | grep ipip -
ipip                   16384  0
tunnel4                16384  1 ipip
ip_tunnel              28672  1 ipip
$ cat /proc/sys/kernel/modules_autoload_mode
0

After:
$ lsmod | grep ipip -
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
add tunnel "tunl0" failed: No such device
$ dmesg
...
[ 1876.378389] module: automatic module loading of netdev-tunl0 by "ip"[1453] was denied
[ 1876.380994] module: automatic module loading of tunl0 by "ip"[1453] was denied
...
$ lsmod | grep ipip -
$

Example 2)

DCCP use after free CVE-2017-6074:
The code path can be triggered by unprivileged, using the trigger.c
program for DCCP use after free [2] and that was fixed by
commit 5edabca9d4cff7f "dccp: fix freeing skb too early for IPV6_RECVPKTINFO".

Before:
$ lsmod | grep dccp
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = 3
...
$ lsmod | grep dccp
dccp_ipv6              24576  5
dccp_ipv4              24576  5 dccp_ipv6
dccp                  102400  2 dccp_ipv6,dccp_ipv4

After:
Only privileged:
$ lsmod | grep dccp
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
$ lsmod | grep dccp
$ dmesg
...
[  175.945063] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1390] was denied
[  175.947952] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1390] was denied
[  175.956061] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1390] was denied
[  175.959733] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1390] was denied

$ sudo strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = 3
...
$ lsmod | grep dccp
dccp_ipv6              24576  6
dccp_ipv4              24576  5 dccp_ipv6
dccp                  102400  2 dccp_ipv6,dccp_ipv4

Disable automatic module loading:
$ lsmod | grep dccp
$ su - root
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
$ lsmod | grep dccp
$ dmesg
...
[  126.596545] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1291] was denied
[  126.598800] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1291] was denied
[  126.601264] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1291] was denied
[  126.602839] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1291] was denied

As an example, this blocks abuses, DCCP still can be explicilty loaded by
an administrator using modprobe, at same time automatic module loading is
disabled forever.

[1] http://www.openwall.com/lists/oss-security/2017/02/22/3
[2] https://github.com/xairy/kernel-exploits/tree/master/CVE-2017-6074
[3] http://www.openwall.com/lists/oss-security/2017/03/29/2
[4] http://www.openwall.com/lists/oss-security/2017/03/07/6
[5] https://a13xp0p0v.github.io/2017/03/24/CVE-2017-2636.html

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: Solar Designer <solar@openwall.com>
Cc: Andy Lutomirski <luto@kernel.org>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
---
 Documentation/sysctl/kernel.txt | 54 +++++++++++++++++++++++++++
 include/linux/module.h          | 11 +++++-
 kernel/module.c                 | 81 ++++++++++++++++++++++++++++++++++++++++-
 kernel/sysctl.c                 | 28 ++++++++++++++
 4 files changed, 172 insertions(+), 2 deletions(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 694968c..dc44075 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -43,6 +43,7 @@ show up in /proc/sys/kernel:
 - l2cr                        [ PPC only ]
 - modprobe                    ==> Documentation/debugging-modules.txt
 - modules_disabled
+- modules_autoload_mode
 - msg_next_id		      [ sysv ipc ]
 - msgmax
 - msgmnb
@@ -413,6 +414,59 @@ to false.  Generally used with the "kexec_load_disabled" toggle.
 
 ==============================================================
 
+modules_autoload_mode:
+
+A sysctl to control if modules auto-load feature is allowed or not.
+This sysctl complements "modules_disabled" which is for all module
+operations where this flag applies only to automatic module loading.
+Automatic module loading happens when programs request a kernel
+feature that is implemented by an unloaded module, the kernel
+automatically runs the program pointed by "modprobe" sysctl in order
+to load the corresponding module.
+
+Historically, the kernel was always able to automatically load modules
+if they are not blacklisted. This is one of the most important and
+transparent operations of Linux, it allows to provide numerous other
+features as they are needed which is crucial for a better user experience.
+However, as Linux is popular now and used for different appliances some
+of these may need to control such operations. For such systems, recent
+needs showed that in some cases allowing to control automatic module
+loading is as important as the operation itself. Restricting unprivileged
+programs or attackers that abuse this feature to load unused modules or
+modules that contain bugs is a significant security measure.
+
+The three modes that "modules_autoload_mode" support allow to provide
+restrictions on automatic module loading without breaking user
+experience.
+
+When modules_autoload_mode is set to (0), the default, there are no
+restrictions.
+
+When modules_autoload_mode is set to (1), processes must have
+CAP_SYS_MODULE to be able to trigger a module auto-load operation,
+CAP_NET_ADMIN for modules with a 'netdev-%s' alias, or other
+capabilities for specific aliased modules.
+
+When modules_autoload_mode is set to (2), automatic module loading
+is disabled for all.
+
+
+Notes on relation between "modules_disabled=0" and
+"modules_autoload_mode=2":
+1) Once "modules_disabled=1" set, it needs a reboot to undo the
+setting.
+2) Restricting automatic module loading does not interfere with
+explicit module load or unload operations.
+3) New features provided by modules can be made available without
+rebooting the system.
+4) A bad version of a module can still be unloaded and replaced with
+a better one without rebooting the system.
+
+The idea of module auto-load restriction was inspired from grsecurity
+'GRKERNSEC_MODHARDEN' config option.
+
+==============================================================
+
 msg_next_id, sem_next_id, and shm_next_id:
 
 These three toggles allows to specify desired id for next allocated IPC
diff --git a/include/linux/module.h b/include/linux/module.h
index 5cbb239..c36aed8 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -261,7 +261,16 @@ struct notifier_block;
 
 #ifdef CONFIG_MODULES
 
-extern int modules_disabled; /* for sysctl */
+enum {
+	MODULES_AUTOLOAD_ALLOWED	= 0,
+	MODULES_AUTOLOAD_PRIVILEGED	= 1,
+	MODULES_AUTOLOAD_DISABLED	= 2,
+};
+
+extern int modules_disabled; /* sysctl for explicit module load/unload */
+extern int modules_autoload_mode; /* sysctl for automatic module loading */
+extern const int modules_autoload_max; /* max value for modules_autoload_mode */
+
 /* Get/put a kernel symbol (calls must be symmetric) */
 void *__symbol_get(const char *symbol);
 void *__symbol_get_gpl(const char *symbol);
diff --git a/kernel/module.c b/kernel/module.c
index 3380d39..a7205fb 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -290,6 +290,8 @@ EXPORT_SYMBOL(is_module_sig_enforced);
 
 /* Block module loading/unloading? */
 int modules_disabled = 0;
+int modules_autoload_mode = MODULES_AUTOLOAD_ALLOWED;
+const int modules_autoload_max = MODULES_AUTOLOAD_DISABLED;
 core_param(nomodule, modules_disabled, bint, 0);
 
 /* Waiting for a module to finish initializing? */
@@ -4355,12 +4357,89 @@ EXPORT_SYMBOL_GPL(__module_text_address);
  * modules, some of them are not updated often and may contain bugs and
  * vulnerabilities.
  *
+ * If "@required_cap" is positive and a valid capability then it is checked
+ * together with the "@kmod_prefix" to either allow or deny automatic module
+ * loading.
+ *
+ * However even if the caller has the required capability, the operation can
+ * still be denied due to the global "modules_autoload_mode" sysctl mode. Unless
+ * set by enduser, the operation is always allowed which is the default.
+ *
+ * The permission check is performed in this order:
+ * 1) If the global sysctl "modules_autoload_mode" is set to 'disabled', then
+ *    operation is denied.
+ *
+ * 2) If the global sysctl "modules_autoload_mode" is set to 'privileged', then:
+ *
+ *   2.1) If "@required_cap" is positive and "@kmod_prefix" is set, then
+ *   if the caller has the capability, the operation is allowed.
+ *
+ *   2.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
+ *   fallback to check if caller has CAP_SYS_MODULE, if so, operation is
+ *   allowed.
+ *
+ *   2.3) If caller passes "@required_cap" as a negative then we fallback to
+ *   check if caller has CAP_SYS_MODULE, if so, operation is allowed.
+ *
+ *   We require capabilities to autoload modules here, and CAP_SYS_MODULE here is
+ *   the default.
+ *
+ *   2.4) Otherwise operation is denied.
+ *
+ * 3) If the global sysctl "modules_autoload_mode" is set to 'allowed' which is
+ *    the default, then:
+ *
+ *   3.1) If "@required_cap" is positive and "@kmod_prefix" is set, we check if
+ *   caller has the capability, if so, operation is allowed.
+ *   In this case the calling subsystem requires the capability to be set before
+ *   allowing modules autoload operations and we have to honor that.
+ *
+ *   3.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
+ *   fallback to check if caller has CAP_SYS_MODULE, if so, operation is
+ *   allowed.
+ *
+ *   3.3) If caller passes "@required_cap" as a negative then operation is
+ *   allowed. This is the most common case as it is used now by
+ *   request_module() function.
+ *
+ *   3.4) Otherwise operation is denied.
+ *
  * Returns 0 if the module request is allowed or -EPERM if not.
  */
 int may_autoload_module(char *kmod_name, int required_cap,
 			const char *kmod_prefix)
 {
-	return 0;
+	int module_require_cap = CAP_SYS_MODULE;
+	unsigned int autoload = modules_autoload_mode;
+
+	/* Short-cut for most use cases where kmod auto-loading is allowed */
+	if (autoload == MODULES_AUTOLOAD_ALLOWED && required_cap < 0)
+		return 0;
+
+	/* If autoload is disabled then fail here */
+	if (autoload == MODULES_AUTOLOAD_DISABLED)
+		return -EPERM;
+
+	/* If caller requires privileges */
+	if (required_cap > 0) {
+		/*
+		 * If '@kmod_prefix' is set then use the '@required_cap'.
+		 * This allows to cover 'netdev-%s' alias modules and others
+		 * with their corresponding capability
+		 */
+		if (kmod_prefix != NULL && *kmod_prefix != '\0')
+			module_require_cap = required_cap;
+	}
+
+	/*
+	 * We require privileges if '@required_cap' was set or if the
+	 * 'modules_autoload_mode' is set to 'privileged' mode.
+	 */
+	if (capable(module_require_cap))
+		return 0;
+
+	/* Otherwise fail */
+	return -EPERM;
 }
 
 /* Don't grab lock, we're oopsing. */
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 2fb4e27..0b6f0c8 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -207,6 +207,11 @@ static int proc_taint(struct ctl_table *table, int write,
 			       void __user *buffer, size_t *lenp, loff_t *ppos);
 #endif
 
+#ifdef CONFIG_MODULES
+static int modules_autoload_dointvec_minmax(struct ctl_table *table, int write,
+				void __user *buffer, size_t *lenp, loff_t *ppos);
+#endif
+
 #ifdef CONFIG_PRINTK
 static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
 				void __user *buffer, size_t *lenp, loff_t *ppos);
@@ -683,6 +688,15 @@ static struct ctl_table kern_table[] = {
 		.extra1		= &one,
 		.extra2		= &one,
 	},
+	{
+		.procname	= "modules_autoload_mode",
+		.data		= &modules_autoload_mode,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= modules_autoload_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= (void *)&modules_autoload_max,
+	},
 #endif
 #ifdef CONFIG_UEVENT_HELPER
 	{
@@ -2499,6 +2513,20 @@ static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
 }
 #endif
 
+#ifdef CONFIG_MODULES
+static int modules_autoload_dointvec_minmax(struct ctl_table *table, int write,
+				void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	/*
+	 * Only CAP_SYS_MODULE in init user namespace are allowed to change this
+	 */
+	if (write && !capable(CAP_SYS_MODULE))
+		return -EPERM;
+
+	return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+}
+#endif
+
 struct do_proc_dointvec_minmax_conv_param {
 	int *min;
 	int *max;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 266+ messages in thread

* [PATCH v5 next 3/5] modules:capabilities: automatic module loading restriction
@ 2017-11-27 17:18   ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: linux-security-module

Currently, an explicit call to load or unload kernel modules require
CAP_SYS_MODULE capability. However unprivileged users have always been
able to load some modules using the implicit auto-load operation. An
automatic module loading happens when programs request a kernel feature
from a module that is not loaded. In order to satisfy userspace, the
kernel then automatically load all these required modules.

Historically, the kernel was always able to automatically load modules
if they are not blacklisted. This is one of the most important and
transparent operations of Linux, it allows to provide numerous other
features as they are needed which is crucial for a better user experience.
However, as Linux is popular now and used for different appliances some
of these may need to control such operations. For such systems, recent
needs showed that in some cases allowing to control automatic module
loading is as important as the operation itself. Restricting unprivileged
programs or attackers that abuse this feature to load unused modules or
modules that contain bugs is a significant security measure.

This allows administrators or some special programs to have the
appropriate time to update and deny module autoloading in advance, then
blacklist the corresponding ones. Not doing so may affect the global state
of the machine, especially containers where some apps are moved from one
context to another and not having such mechanisms may allow to expose
and exploit the vulnerable parts to escape the container sandbox.

Embedded or IoT devices also started to ship as containers using generic
distros, some vendors do not have the appropriate time to make their own
OS, hence, using base images is getting popular. These setups may include
unnecessary modules that the final applications will not need. Untrusted
access may abuse the module auto-load feature to expose vulnerabilities.

As every code contains bugs or vulnerabilties, the following
vulnerabilities that affected some features that are often compiled as
modules could have been completely blocked, by restricting autoloading
modules if the system does not need them.

Past months:
* DCCP use after free CVE-2017-6074 [1] [2]
  Unprivileged to local root.

* XFRM framework CVE-2017-7184 [3]
  As advertised it seems it was used to break Ubuntu on a security
  contest.

* n_hldc CVE-2017-2636 [4] [5]
  Local privilege escalation.

* L2TPv3 CVE-2016-10200

The list is longer.

To improve the current status, this patch introduces "modules_autoload_mode"
kernel sysctl flag. The flag controls modules auto-load feature and
complements "modules_disabled" which apply to all modules operations.
This new flag allows to control only automatic module loading and if it is
allowed or not, aligning in the process the implicit operation with the
explicit one where both now are covered by capabilities checks.

The three modes that "modules_autoload_mode" support allow to provide
restrictions on automatic module loading without breaking user
experience.

The sysctl flag is available at "/proc/sys/kernel/modules_autoload_mode"

When modules_autoload_mode is set to (0), the default, there are no
restrictions.

When modules_autoload_mode is set to (1), processes must have
CAP_SYS_MODULE to be able to trigger a module auto-load operation,
or CAP_NET_ADMIN for modules with a 'netdev-%s' alias, or other
capabilities for specific aliased modules.

When modules_autoload_mode is set to (2), automatic module loading
is disabled for all.

Notes on relation between "modules_disabled=0" and
"modules_autoload_mode=2":
1) Once "modules_disabled=1" set, it needs a reboot to undo the
setting.

2) Restricting automatic module loading does not interfere with
explicit module load or unload operations.

3) New features provided by modules can be made available without
rebooting the system.

4) A bad version of a module can be unloaded and replaced with a
better one without rebooting the system.

The idea of module auto-load restriction was inspired from grsecurity
'GRKERNSEC_MODHARDEN' config option. Upstream Linux implementation is
more focused on the run-time behavior with a three mode switch.

Testing
-------

Example 1)

Before:
$ lsmod | grep ipip -
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
$ lsmod | grep ipip -
ipip                   16384  0
tunnel4                16384  1 ipip
ip_tunnel              28672  1 ipip
$ cat /proc/sys/kernel/modules_autoload_mode
0

After:
$ lsmod | grep ipip -
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
add tunnel "tunl0" failed: No such device
$ dmesg
...
[ 1876.378389] module: automatic module loading of netdev-tunl0 by "ip"[1453] was denied
[ 1876.380994] module: automatic module loading of tunl0 by "ip"[1453] was denied
...
$ lsmod | grep ipip -
$

Example 2)

DCCP use after free CVE-2017-6074:
The code path can be triggered by unprivileged, using the trigger.c
program for DCCP use after free [2] and that was fixed by
commit 5edabca9d4cff7f "dccp: fix freeing skb too early for IPV6_RECVPKTINFO".

Before:
$ lsmod | grep dccp
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = 3
...
$ lsmod | grep dccp
dccp_ipv6              24576  5
dccp_ipv4              24576  5 dccp_ipv6
dccp                  102400  2 dccp_ipv6,dccp_ipv4

After:
Only privileged:
$ lsmod | grep dccp
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
$ lsmod | grep dccp
$ dmesg
...
[  175.945063] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1390] was denied
[  175.947952] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1390] was denied
[  175.956061] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1390] was denied
[  175.959733] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1390] was denied

$ sudo strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = 3
...
$ lsmod | grep dccp
dccp_ipv6              24576  6
dccp_ipv4              24576  5 dccp_ipv6
dccp                  102400  2 dccp_ipv6,dccp_ipv4

Disable automatic module loading:
$ lsmod | grep dccp
$ su - root
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
$ lsmod | grep dccp
$ dmesg
...
[  126.596545] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1291] was denied
[  126.598800] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1291] was denied
[  126.601264] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1291] was denied
[  126.602839] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1291] was denied

As an example, this blocks abuses, DCCP still can be explicilty loaded by
an administrator using modprobe, at same time automatic module loading is
disabled forever.

[1] http://www.openwall.com/lists/oss-security/2017/02/22/3
[2] https://github.com/xairy/kernel-exploits/tree/master/CVE-2017-6074
[3] http://www.openwall.com/lists/oss-security/2017/03/29/2
[4] http://www.openwall.com/lists/oss-security/2017/03/07/6
[5] https://a13xp0p0v.github.io/2017/03/24/CVE-2017-2636.html

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: Solar Designer <solar@openwall.com>
Cc: Andy Lutomirski <luto@kernel.org>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
---
 Documentation/sysctl/kernel.txt | 54 +++++++++++++++++++++++++++
 include/linux/module.h          | 11 +++++-
 kernel/module.c                 | 81 ++++++++++++++++++++++++++++++++++++++++-
 kernel/sysctl.c                 | 28 ++++++++++++++
 4 files changed, 172 insertions(+), 2 deletions(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 694968c..dc44075 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -43,6 +43,7 @@ show up in /proc/sys/kernel:
 - l2cr                        [ PPC only ]
 - modprobe                    ==> Documentation/debugging-modules.txt
 - modules_disabled
+- modules_autoload_mode
 - msg_next_id		      [ sysv ipc ]
 - msgmax
 - msgmnb
@@ -413,6 +414,59 @@ to false.  Generally used with the "kexec_load_disabled" toggle.
 
 ==============================================================
 
+modules_autoload_mode:
+
+A sysctl to control if modules auto-load feature is allowed or not.
+This sysctl complements "modules_disabled" which is for all module
+operations where this flag applies only to automatic module loading.
+Automatic module loading happens when programs request a kernel
+feature that is implemented by an unloaded module, the kernel
+automatically runs the program pointed by "modprobe" sysctl in order
+to load the corresponding module.
+
+Historically, the kernel was always able to automatically load modules
+if they are not blacklisted. This is one of the most important and
+transparent operations of Linux, it allows to provide numerous other
+features as they are needed which is crucial for a better user experience.
+However, as Linux is popular now and used for different appliances some
+of these may need to control such operations. For such systems, recent
+needs showed that in some cases allowing to control automatic module
+loading is as important as the operation itself. Restricting unprivileged
+programs or attackers that abuse this feature to load unused modules or
+modules that contain bugs is a significant security measure.
+
+The three modes that "modules_autoload_mode" support allow to provide
+restrictions on automatic module loading without breaking user
+experience.
+
+When modules_autoload_mode is set to (0), the default, there are no
+restrictions.
+
+When modules_autoload_mode is set to (1), processes must have
+CAP_SYS_MODULE to be able to trigger a module auto-load operation,
+CAP_NET_ADMIN for modules with a 'netdev-%s' alias, or other
+capabilities for specific aliased modules.
+
+When modules_autoload_mode is set to (2), automatic module loading
+is disabled for all.
+
+
+Notes on relation between "modules_disabled=0" and
+"modules_autoload_mode=2":
+1) Once "modules_disabled=1" set, it needs a reboot to undo the
+setting.
+2) Restricting automatic module loading does not interfere with
+explicit module load or unload operations.
+3) New features provided by modules can be made available without
+rebooting the system.
+4) A bad version of a module can still be unloaded and replaced with
+a better one without rebooting the system.
+
+The idea of module auto-load restriction was inspired from grsecurity
+'GRKERNSEC_MODHARDEN' config option.
+
+==============================================================
+
 msg_next_id, sem_next_id, and shm_next_id:
 
 These three toggles allows to specify desired id for next allocated IPC
diff --git a/include/linux/module.h b/include/linux/module.h
index 5cbb239..c36aed8 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -261,7 +261,16 @@ struct notifier_block;
 
 #ifdef CONFIG_MODULES
 
-extern int modules_disabled; /* for sysctl */
+enum {
+	MODULES_AUTOLOAD_ALLOWED	= 0,
+	MODULES_AUTOLOAD_PRIVILEGED	= 1,
+	MODULES_AUTOLOAD_DISABLED	= 2,
+};
+
+extern int modules_disabled; /* sysctl for explicit module load/unload */
+extern int modules_autoload_mode; /* sysctl for automatic module loading */
+extern const int modules_autoload_max; /* max value for modules_autoload_mode */
+
 /* Get/put a kernel symbol (calls must be symmetric) */
 void *__symbol_get(const char *symbol);
 void *__symbol_get_gpl(const char *symbol);
diff --git a/kernel/module.c b/kernel/module.c
index 3380d39..a7205fb 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -290,6 +290,8 @@ EXPORT_SYMBOL(is_module_sig_enforced);
 
 /* Block module loading/unloading? */
 int modules_disabled = 0;
+int modules_autoload_mode = MODULES_AUTOLOAD_ALLOWED;
+const int modules_autoload_max = MODULES_AUTOLOAD_DISABLED;
 core_param(nomodule, modules_disabled, bint, 0);
 
 /* Waiting for a module to finish initializing? */
@@ -4355,12 +4357,89 @@ EXPORT_SYMBOL_GPL(__module_text_address);
  * modules, some of them are not updated often and may contain bugs and
  * vulnerabilities.
  *
+ * If "@required_cap" is positive and a valid capability then it is checked
+ * together with the "@kmod_prefix" to either allow or deny automatic module
+ * loading.
+ *
+ * However even if the caller has the required capability, the operation can
+ * still be denied due to the global "modules_autoload_mode" sysctl mode. Unless
+ * set by enduser, the operation is always allowed which is the default.
+ *
+ * The permission check is performed in this order:
+ * 1) If the global sysctl "modules_autoload_mode" is set to 'disabled', then
+ *    operation is denied.
+ *
+ * 2) If the global sysctl "modules_autoload_mode" is set to 'privileged', then:
+ *
+ *   2.1) If "@required_cap" is positive and "@kmod_prefix" is set, then
+ *   if the caller has the capability, the operation is allowed.
+ *
+ *   2.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
+ *   fallback to check if caller has CAP_SYS_MODULE, if so, operation is
+ *   allowed.
+ *
+ *   2.3) If caller passes "@required_cap" as a negative then we fallback to
+ *   check if caller has CAP_SYS_MODULE, if so, operation is allowed.
+ *
+ *   We require capabilities to autoload modules here, and CAP_SYS_MODULE here is
+ *   the default.
+ *
+ *   2.4) Otherwise operation is denied.
+ *
+ * 3) If the global sysctl "modules_autoload_mode" is set to 'allowed' which is
+ *    the default, then:
+ *
+ *   3.1) If "@required_cap" is positive and "@kmod_prefix" is set, we check if
+ *   caller has the capability, if so, operation is allowed.
+ *   In this case the calling subsystem requires the capability to be set before
+ *   allowing modules autoload operations and we have to honor that.
+ *
+ *   3.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
+ *   fallback to check if caller has CAP_SYS_MODULE, if so, operation is
+ *   allowed.
+ *
+ *   3.3) If caller passes "@required_cap" as a negative then operation is
+ *   allowed. This is the most common case as it is used now by
+ *   request_module() function.
+ *
+ *   3.4) Otherwise operation is denied.
+ *
  * Returns 0 if the module request is allowed or -EPERM if not.
  */
 int may_autoload_module(char *kmod_name, int required_cap,
 			const char *kmod_prefix)
 {
-	return 0;
+	int module_require_cap = CAP_SYS_MODULE;
+	unsigned int autoload = modules_autoload_mode;
+
+	/* Short-cut for most use cases where kmod auto-loading is allowed */
+	if (autoload == MODULES_AUTOLOAD_ALLOWED && required_cap < 0)
+		return 0;
+
+	/* If autoload is disabled then fail here */
+	if (autoload == MODULES_AUTOLOAD_DISABLED)
+		return -EPERM;
+
+	/* If caller requires privileges */
+	if (required_cap > 0) {
+		/*
+		 * If '@kmod_prefix' is set then use the '@required_cap'.
+		 * This allows to cover 'netdev-%s' alias modules and others
+		 * with their corresponding capability
+		 */
+		if (kmod_prefix != NULL && *kmod_prefix != '\0')
+			module_require_cap = required_cap;
+	}
+
+	/*
+	 * We require privileges if '@required_cap' was set or if the
+	 * 'modules_autoload_mode' is set to 'privileged' mode.
+	 */
+	if (capable(module_require_cap))
+		return 0;
+
+	/* Otherwise fail */
+	return -EPERM;
 }
 
 /* Don't grab lock, we're oopsing. */
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 2fb4e27..0b6f0c8 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -207,6 +207,11 @@ static int proc_taint(struct ctl_table *table, int write,
 			       void __user *buffer, size_t *lenp, loff_t *ppos);
 #endif
 
+#ifdef CONFIG_MODULES
+static int modules_autoload_dointvec_minmax(struct ctl_table *table, int write,
+				void __user *buffer, size_t *lenp, loff_t *ppos);
+#endif
+
 #ifdef CONFIG_PRINTK
 static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
 				void __user *buffer, size_t *lenp, loff_t *ppos);
@@ -683,6 +688,15 @@ static struct ctl_table kern_table[] = {
 		.extra1		= &one,
 		.extra2		= &one,
 	},
+	{
+		.procname	= "modules_autoload_mode",
+		.data		= &modules_autoload_mode,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= modules_autoload_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= (void *)&modules_autoload_max,
+	},
 #endif
 #ifdef CONFIG_UEVENT_HELPER
 	{
@@ -2499,6 +2513,20 @@ static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
 }
 #endif
 
+#ifdef CONFIG_MODULES
+static int modules_autoload_dointvec_minmax(struct ctl_table *table, int write,
+				void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	/*
+	 * Only CAP_SYS_MODULE in init user namespace are allowed to change this
+	 */
+	if (write && !capable(CAP_SYS_MODULE))
+		return -EPERM;
+
+	return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+}
+#endif
+
 struct do_proc_dointvec_minmax_conv_param {
 	int *min;
 	int *max;
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 266+ messages in thread

* [kernel-hardening] [PATCH v5 next 3/5] modules:capabilities: automatic module loading restriction
@ 2017-11-27 17:18   ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening
  Cc: Jonathan Corbet, Ingo Molnar, David S. Miller, netdev,
	Peter Zijlstra, Linus Torvalds, Djalal Harouni

Currently, an explicit call to load or unload kernel modules require
CAP_SYS_MODULE capability. However unprivileged users have always been
able to load some modules using the implicit auto-load operation. An
automatic module loading happens when programs request a kernel feature
from a module that is not loaded. In order to satisfy userspace, the
kernel then automatically load all these required modules.

Historically, the kernel was always able to automatically load modules
if they are not blacklisted. This is one of the most important and
transparent operations of Linux, it allows to provide numerous other
features as they are needed which is crucial for a better user experience.
However, as Linux is popular now and used for different appliances some
of these may need to control such operations. For such systems, recent
needs showed that in some cases allowing to control automatic module
loading is as important as the operation itself. Restricting unprivileged
programs or attackers that abuse this feature to load unused modules or
modules that contain bugs is a significant security measure.

This allows administrators or some special programs to have the
appropriate time to update and deny module autoloading in advance, then
blacklist the corresponding ones. Not doing so may affect the global state
of the machine, especially containers where some apps are moved from one
context to another and not having such mechanisms may allow to expose
and exploit the vulnerable parts to escape the container sandbox.

Embedded or IoT devices also started to ship as containers using generic
distros, some vendors do not have the appropriate time to make their own
OS, hence, using base images is getting popular. These setups may include
unnecessary modules that the final applications will not need. Untrusted
access may abuse the module auto-load feature to expose vulnerabilities.

As every code contains bugs or vulnerabilties, the following
vulnerabilities that affected some features that are often compiled as
modules could have been completely blocked, by restricting autoloading
modules if the system does not need them.

Past months:
* DCCP use after free CVE-2017-6074 [1] [2]
  Unprivileged to local root.

* XFRM framework CVE-2017-7184 [3]
  As advertised it seems it was used to break Ubuntu on a security
  contest.

* n_hldc CVE-2017-2636 [4] [5]
  Local privilege escalation.

* L2TPv3 CVE-2016-10200

The list is longer.

To improve the current status, this patch introduces "modules_autoload_mode"
kernel sysctl flag. The flag controls modules auto-load feature and
complements "modules_disabled" which apply to all modules operations.
This new flag allows to control only automatic module loading and if it is
allowed or not, aligning in the process the implicit operation with the
explicit one where both now are covered by capabilities checks.

The three modes that "modules_autoload_mode" support allow to provide
restrictions on automatic module loading without breaking user
experience.

The sysctl flag is available at "/proc/sys/kernel/modules_autoload_mode"

When modules_autoload_mode is set to (0), the default, there are no
restrictions.

When modules_autoload_mode is set to (1), processes must have
CAP_SYS_MODULE to be able to trigger a module auto-load operation,
or CAP_NET_ADMIN for modules with a 'netdev-%s' alias, or other
capabilities for specific aliased modules.

When modules_autoload_mode is set to (2), automatic module loading
is disabled for all.

Notes on relation between "modules_disabled=0" and
"modules_autoload_mode=2":
1) Once "modules_disabled=1" set, it needs a reboot to undo the
setting.

2) Restricting automatic module loading does not interfere with
explicit module load or unload operations.

3) New features provided by modules can be made available without
rebooting the system.

4) A bad version of a module can be unloaded and replaced with a
better one without rebooting the system.

The idea of module auto-load restriction was inspired from grsecurity
'GRKERNSEC_MODHARDEN' config option. Upstream Linux implementation is
more focused on the run-time behavior with a three mode switch.

Testing
-------

Example 1)

Before:
$ lsmod | grep ipip -
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
$ lsmod | grep ipip -
ipip                   16384  0
tunnel4                16384  1 ipip
ip_tunnel              28672  1 ipip
$ cat /proc/sys/kernel/modules_autoload_mode
0

After:
$ lsmod | grep ipip -
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
add tunnel "tunl0" failed: No such device
$ dmesg
...
[ 1876.378389] module: automatic module loading of netdev-tunl0 by "ip"[1453] was denied
[ 1876.380994] module: automatic module loading of tunl0 by "ip"[1453] was denied
...
$ lsmod | grep ipip -
$

Example 2)

DCCP use after free CVE-2017-6074:
The code path can be triggered by unprivileged, using the trigger.c
program for DCCP use after free [2] and that was fixed by
commit 5edabca9d4cff7f "dccp: fix freeing skb too early for IPV6_RECVPKTINFO".

Before:
$ lsmod | grep dccp
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = 3
...
$ lsmod | grep dccp
dccp_ipv6              24576  5
dccp_ipv4              24576  5 dccp_ipv6
dccp                  102400  2 dccp_ipv6,dccp_ipv4

After:
Only privileged:
$ lsmod | grep dccp
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
$ lsmod | grep dccp
$ dmesg
...
[  175.945063] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1390] was denied
[  175.947952] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1390] was denied
[  175.956061] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1390] was denied
[  175.959733] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1390] was denied

$ sudo strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = 3
...
$ lsmod | grep dccp
dccp_ipv6              24576  6
dccp_ipv4              24576  5 dccp_ipv6
dccp                  102400  2 dccp_ipv6,dccp_ipv4

Disable automatic module loading:
$ lsmod | grep dccp
$ su - root
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
$ lsmod | grep dccp
$ dmesg
...
[  126.596545] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1291] was denied
[  126.598800] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1291] was denied
[  126.601264] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1291] was denied
[  126.602839] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1291] was denied

As an example, this blocks abuses, DCCP still can be explicilty loaded by
an administrator using modprobe, at same time automatic module loading is
disabled forever.

[1] http://www.openwall.com/lists/oss-security/2017/02/22/3
[2] https://github.com/xairy/kernel-exploits/tree/master/CVE-2017-6074
[3] http://www.openwall.com/lists/oss-security/2017/03/29/2
[4] http://www.openwall.com/lists/oss-security/2017/03/07/6
[5] https://a13xp0p0v.github.io/2017/03/24/CVE-2017-2636.html

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: Solar Designer <solar@openwall.com>
Cc: Andy Lutomirski <luto@kernel.org>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
---
 Documentation/sysctl/kernel.txt | 54 +++++++++++++++++++++++++++
 include/linux/module.h          | 11 +++++-
 kernel/module.c                 | 81 ++++++++++++++++++++++++++++++++++++++++-
 kernel/sysctl.c                 | 28 ++++++++++++++
 4 files changed, 172 insertions(+), 2 deletions(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 694968c..dc44075 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -43,6 +43,7 @@ show up in /proc/sys/kernel:
 - l2cr                        [ PPC only ]
 - modprobe                    ==> Documentation/debugging-modules.txt
 - modules_disabled
+- modules_autoload_mode
 - msg_next_id		      [ sysv ipc ]
 - msgmax
 - msgmnb
@@ -413,6 +414,59 @@ to false.  Generally used with the "kexec_load_disabled" toggle.
 
 ==============================================================
 
+modules_autoload_mode:
+
+A sysctl to control if modules auto-load feature is allowed or not.
+This sysctl complements "modules_disabled" which is for all module
+operations where this flag applies only to automatic module loading.
+Automatic module loading happens when programs request a kernel
+feature that is implemented by an unloaded module, the kernel
+automatically runs the program pointed by "modprobe" sysctl in order
+to load the corresponding module.
+
+Historically, the kernel was always able to automatically load modules
+if they are not blacklisted. This is one of the most important and
+transparent operations of Linux, it allows to provide numerous other
+features as they are needed which is crucial for a better user experience.
+However, as Linux is popular now and used for different appliances some
+of these may need to control such operations. For such systems, recent
+needs showed that in some cases allowing to control automatic module
+loading is as important as the operation itself. Restricting unprivileged
+programs or attackers that abuse this feature to load unused modules or
+modules that contain bugs is a significant security measure.
+
+The three modes that "modules_autoload_mode" support allow to provide
+restrictions on automatic module loading without breaking user
+experience.
+
+When modules_autoload_mode is set to (0), the default, there are no
+restrictions.
+
+When modules_autoload_mode is set to (1), processes must have
+CAP_SYS_MODULE to be able to trigger a module auto-load operation,
+CAP_NET_ADMIN for modules with a 'netdev-%s' alias, or other
+capabilities for specific aliased modules.
+
+When modules_autoload_mode is set to (2), automatic module loading
+is disabled for all.
+
+
+Notes on relation between "modules_disabled=0" and
+"modules_autoload_mode=2":
+1) Once "modules_disabled=1" set, it needs a reboot to undo the
+setting.
+2) Restricting automatic module loading does not interfere with
+explicit module load or unload operations.
+3) New features provided by modules can be made available without
+rebooting the system.
+4) A bad version of a module can still be unloaded and replaced with
+a better one without rebooting the system.
+
+The idea of module auto-load restriction was inspired from grsecurity
+'GRKERNSEC_MODHARDEN' config option.
+
+==============================================================
+
 msg_next_id, sem_next_id, and shm_next_id:
 
 These three toggles allows to specify desired id for next allocated IPC
diff --git a/include/linux/module.h b/include/linux/module.h
index 5cbb239..c36aed8 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -261,7 +261,16 @@ struct notifier_block;
 
 #ifdef CONFIG_MODULES
 
-extern int modules_disabled; /* for sysctl */
+enum {
+	MODULES_AUTOLOAD_ALLOWED	= 0,
+	MODULES_AUTOLOAD_PRIVILEGED	= 1,
+	MODULES_AUTOLOAD_DISABLED	= 2,
+};
+
+extern int modules_disabled; /* sysctl for explicit module load/unload */
+extern int modules_autoload_mode; /* sysctl for automatic module loading */
+extern const int modules_autoload_max; /* max value for modules_autoload_mode */
+
 /* Get/put a kernel symbol (calls must be symmetric) */
 void *__symbol_get(const char *symbol);
 void *__symbol_get_gpl(const char *symbol);
diff --git a/kernel/module.c b/kernel/module.c
index 3380d39..a7205fb 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -290,6 +290,8 @@ EXPORT_SYMBOL(is_module_sig_enforced);
 
 /* Block module loading/unloading? */
 int modules_disabled = 0;
+int modules_autoload_mode = MODULES_AUTOLOAD_ALLOWED;
+const int modules_autoload_max = MODULES_AUTOLOAD_DISABLED;
 core_param(nomodule, modules_disabled, bint, 0);
 
 /* Waiting for a module to finish initializing? */
@@ -4355,12 +4357,89 @@ EXPORT_SYMBOL_GPL(__module_text_address);
  * modules, some of them are not updated often and may contain bugs and
  * vulnerabilities.
  *
+ * If "@required_cap" is positive and a valid capability then it is checked
+ * together with the "@kmod_prefix" to either allow or deny automatic module
+ * loading.
+ *
+ * However even if the caller has the required capability, the operation can
+ * still be denied due to the global "modules_autoload_mode" sysctl mode. Unless
+ * set by enduser, the operation is always allowed which is the default.
+ *
+ * The permission check is performed in this order:
+ * 1) If the global sysctl "modules_autoload_mode" is set to 'disabled', then
+ *    operation is denied.
+ *
+ * 2) If the global sysctl "modules_autoload_mode" is set to 'privileged', then:
+ *
+ *   2.1) If "@required_cap" is positive and "@kmod_prefix" is set, then
+ *   if the caller has the capability, the operation is allowed.
+ *
+ *   2.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
+ *   fallback to check if caller has CAP_SYS_MODULE, if so, operation is
+ *   allowed.
+ *
+ *   2.3) If caller passes "@required_cap" as a negative then we fallback to
+ *   check if caller has CAP_SYS_MODULE, if so, operation is allowed.
+ *
+ *   We require capabilities to autoload modules here, and CAP_SYS_MODULE here is
+ *   the default.
+ *
+ *   2.4) Otherwise operation is denied.
+ *
+ * 3) If the global sysctl "modules_autoload_mode" is set to 'allowed' which is
+ *    the default, then:
+ *
+ *   3.1) If "@required_cap" is positive and "@kmod_prefix" is set, we check if
+ *   caller has the capability, if so, operation is allowed.
+ *   In this case the calling subsystem requires the capability to be set before
+ *   allowing modules autoload operations and we have to honor that.
+ *
+ *   3.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
+ *   fallback to check if caller has CAP_SYS_MODULE, if so, operation is
+ *   allowed.
+ *
+ *   3.3) If caller passes "@required_cap" as a negative then operation is
+ *   allowed. This is the most common case as it is used now by
+ *   request_module() function.
+ *
+ *   3.4) Otherwise operation is denied.
+ *
  * Returns 0 if the module request is allowed or -EPERM if not.
  */
 int may_autoload_module(char *kmod_name, int required_cap,
 			const char *kmod_prefix)
 {
-	return 0;
+	int module_require_cap = CAP_SYS_MODULE;
+	unsigned int autoload = modules_autoload_mode;
+
+	/* Short-cut for most use cases where kmod auto-loading is allowed */
+	if (autoload == MODULES_AUTOLOAD_ALLOWED && required_cap < 0)
+		return 0;
+
+	/* If autoload is disabled then fail here */
+	if (autoload == MODULES_AUTOLOAD_DISABLED)
+		return -EPERM;
+
+	/* If caller requires privileges */
+	if (required_cap > 0) {
+		/*
+		 * If '@kmod_prefix' is set then use the '@required_cap'.
+		 * This allows to cover 'netdev-%s' alias modules and others
+		 * with their corresponding capability
+		 */
+		if (kmod_prefix != NULL && *kmod_prefix != '\0')
+			module_require_cap = required_cap;
+	}
+
+	/*
+	 * We require privileges if '@required_cap' was set or if the
+	 * 'modules_autoload_mode' is set to 'privileged' mode.
+	 */
+	if (capable(module_require_cap))
+		return 0;
+
+	/* Otherwise fail */
+	return -EPERM;
 }
 
 /* Don't grab lock, we're oopsing. */
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 2fb4e27..0b6f0c8 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -207,6 +207,11 @@ static int proc_taint(struct ctl_table *table, int write,
 			       void __user *buffer, size_t *lenp, loff_t *ppos);
 #endif
 
+#ifdef CONFIG_MODULES
+static int modules_autoload_dointvec_minmax(struct ctl_table *table, int write,
+				void __user *buffer, size_t *lenp, loff_t *ppos);
+#endif
+
 #ifdef CONFIG_PRINTK
 static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
 				void __user *buffer, size_t *lenp, loff_t *ppos);
@@ -683,6 +688,15 @@ static struct ctl_table kern_table[] = {
 		.extra1		= &one,
 		.extra2		= &one,
 	},
+	{
+		.procname	= "modules_autoload_mode",
+		.data		= &modules_autoload_mode,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= modules_autoload_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= (void *)&modules_autoload_max,
+	},
 #endif
 #ifdef CONFIG_UEVENT_HELPER
 	{
@@ -2499,6 +2513,20 @@ static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
 }
 #endif
 
+#ifdef CONFIG_MODULES
+static int modules_autoload_dointvec_minmax(struct ctl_table *table, int write,
+				void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	/*
+	 * Only CAP_SYS_MODULE in init user namespace are allowed to change this
+	 */
+	if (write && !capable(CAP_SYS_MODULE))
+		return -EPERM;
+
+	return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+}
+#endif
+
 struct do_proc_dointvec_minmax_conv_param {
 	int *min;
 	int *max;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 266+ messages in thread

* [PATCH v5 next 4/5] modules:capabilities: add a per-task modules auto-load mode
  2017-11-27 17:18 ` Djalal Harouni
  (?)
@ 2017-11-27 17:18   ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening
  Cc: Jonathan Corbet, Ingo Molnar, David S. Miller, netdev,
	Peter Zijlstra, Linus Torvalds, Djalal Harouni

Previous patches added the global sysctl "modules_autoload_mode". This patch
make it possible to support process trees, containers, and sandboxes by
providing an inherited per-task "modules_autoload_mode" flag that cannot be
re-enabled once disabled. This allows to improve automatic module loading
without affecting the rest of the system.

Why we need this ?

Usually a request to a kernel feature that is implemented by a module
that is not loaded may trigger automatic module loading feature,
allowing to transparently satisfy userspace, and provide numeours
features as they are needed. In this case an implicit kernel module load
operation happens.

In most cases to load or unload a kernel module, an explicit operation
happens where programs are required to have CAP_SYS_MODULE capability to
perform so. However, in general with implicit module loading, no
capabilities are required as automatic module loading is one of the most
important and transparent operations of Linux.

Recent vulnerabilities showed that automatic module loading can be
abused in order to expose more bugs. Some of these vulnerabilities are:

* DCCP use after free CVE-2017-6074 [1] [2]
  Unprivileged to local root PoC.

* XFRM framework CVE-2017-7184 [3]
  As advertised it seems it was used to break Ubuntu at a security
  contest.

* n_hldc CVE-2017-2636 [4] [5]
  Local privilege escalation.

* L2TPv3 CVE-2016-10200

Currently most of Linux code is in a form of modules, and not all
modules are written or maintained in the same way. In a container or
sandbox world, apps can be moved from one context to another or from
one Linux system to another one, the ability to restrict some of these
apps to load extra kernel modules will prevent exposing some kernel
interfaces that have not been updated withing such systems.

The DCCP vulnerability CVE-2017-6074 that can be triggered by
unprivileged, or CVE-2017-7184 in the XFRM framework are some recent
real examples. CVE-2017-7184 was used to break Ubuntu at a security
contest. Ubuntu is more of desktop distro, using a global switch to
disable automatic module loading will harm users. Actually this design
will always end up being ignored by such kind of systems that need to
offer a competitive and interactive solution for their users.

>From this and from observing how apps are being run, this patch
introduces a per-task "modules_autoload_mode" to restrict automatic
module loading. This offers the following advantages:

1) Allows to abstract in userspace as something like:
DenyNewFeatures=yes

2) Automatic module loading is still available to the rest of the
system.

2) It is easy to use in containers and sandboxes. DCCP example could
have been used to escape containers. The XFRM framework CVE-2017-7184
needs CAP_NET_ADMIN, but attackers may start to target CAP_NET_ADMIN,
a per-task flag will make it harder.

3) Suitable for desktop and more interactive Linux systems.

4) Will allow in future to implement a per user policy.
The user database format is old and not extensible, as discussed maybe
with a modern format we may achieve the following:

User=djalal
DenyNewFeatures=no

Which means that interactive user will be allowed to load extra
Linux features. Others, volatile accounts or guests can be easily
blocked from doing so.

5) CAP_NET_ADMIN is useful, it handles lot of operations, at same time it
started to look more like CAP_SYS_ADMIN which is overloaded. We need
CAP_NET_ADMIN, containers need it, but at same time maybe we do not
want programs running with it to load 'netdev-%s' modules. Having an
extra per-task flag allow to discharge CAP_NET_ADMIN and other
capabilities, it is clearly targeted to automatic module loading
operations and from a higher view to 'load new kernel features schema'.

Usage:
------

To set the per-task "modules_autoload_mode":

        prctl(PR_SET_MODULES_AUTOLOAD_MODE, mode, 0, 0, 0);

When a module auto-load request is triggered by current task, then the
operation has first to satisfy the per-task access mode before attempting
to implicitly load the module. Once set, this setting is inherited across
fork, clone and execve.

Prior to use, the task must call prctl(PR_SET_NO_NEW_PRIVS, 1) or run with
CAP_SYS_ADMIN privileges in its namespace.  If these are not true, -EACCES
will be returned.  This requirement ensures that unprivileged programs cannot
affect the behaviour or surprise privileged children.

The per-task "modules_autoload_mode" supports the following values:
0       There are no restrictions, usually the default unless set
        by parent.
1       The task must have CAP_SYS_MODULE to be able to trigger a
        module auto-load operation, or CAP_NET_ADMIN for modules with
        a 'netdev-%s' alias.
2       Automatic modules loading is disabled for the current task.

The mode may only be increased, never decreased, thus ensuring that once
applied, processes can never relax their setting. This make it easy for
developers and users to handle.

Note that even if the per-task "modules_autoload_mode" allows to auto-load
the corresponding modules, automatic module loading may still fail due to
the global sysctl "modules_autoload_mode". For more details please see
Documentation/sysctl/kernel.txt, section "modules_autoload_mode".

When a request to a kernel module is denied, the module name with the
corresponding process name and its pid are logged. Administrators can use
such information to explicitly load the appropriate modules.

Testing per-task or per container setup
---------------------------------------

The following tool can be used to test the feature:
https://gist.githubusercontent.com/tixxdz/cf567e4275714199a32c4a80de4ea63a/raw/13e52ea0ee65772871bcf10fb6c94fedd349f5c1/pr_modules_autoload_mode_test.c

Example 1)

Before patch:
$ lsmod | grep ipip -
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
$ lsmod | grep ipip -
ipip                   16384  0
tunnel4                16384  1 ipip
ip_tunnel              28672  1 ipip
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0

After patch:
Set task "modules_autoload_mode" to disabled.
$ lsmod | grep ipip -
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ su - root
 # ./pr_modules_autoload_mode_test 2
 # grep Modules /proc/self/status
ModulesAutoloadMode:    2
 # ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
add tunnel "tunl0" failed: No such device
...
[  634.954652] module: automatic module loading of netdev-tunl0 by "ip"[1560] was denied
[  634.955775] module: automatic module loading of tunl0 by "ip"[1560] was denied
...

Example 2)

Sample with XFRM tunnel mode.

Before patch:
$ lsmod | grep xfrm -
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ sudo ip xfrm state add src 10.0.2.100 dst 10.0.1.100 proto esp spi $id1 \
> reqid $id2 mode tunnel auth "hmac(sha256)" $key1 enc "cbc(aes)" $key2
$ lsmod | grep xfrm
xfrm4_mode_tunnel      16384  2

After patch:
Set task "modules_autoload_mode" to disabled.
$ lsmod | grep xfrm -
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ su - root
 # ./pr_modules_autoload_mode_test 2
 # grep Modules /proc/self/status
ModulesAutoloadMode:    2
 # ip xfrm state add src 10.0.2.100 dst 10.0.1.100 proto esp spi $id1 \
> reqid $id2 mode tunnel auth "hmac(sha256)" $key1 enc "cbc(aes)" $key2
RTNETLINK answers: Protocol not supported
...
[ 3458.139490] module: automatic module loading of xfrm-mode-2-1 by "ip"[1506] was denied
...

Example 3)

Here we use DCCP as an example since the public PoC was against it.

DCCP use after free CVE-2017-6074 (unprivileged to local root):
The code path can be triggered by unprivileged, using the trigger.c
program for DCCP use after free [2] and that was fixed by
commit 5edabca9d4cff7f "dccp: fix freeing skb too early for IPV6_RECVPKTINFO".

Before patch:
$ lsmod | grep dccp
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = 3
...
$ lsmod | grep dccp
dccp_ipv6              24576  5
dccp_ipv4              24576  5 dccp_ipv6
dccp                  102400  2 dccp_ipv6,dccp_ipv4
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0

After patch:

Set task "modules_autoload_mode" to 1, privileged mode.
$ lsmod | grep dccp
$ ./pr_set_no_new_privs
$ grep NoNewPrivs /proc/self/status
NoNewPrivs:     1
$ ./pr_modules_autoload_mode_test 1
$ grep Modules /proc/self/status
ModulesAutoloadMode:    1
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
$ lsmod | grep dccp
$ dmesg
...
[ 4662.171994] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1759] was denied
[ 4662.177284] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1759] was denied
[ 4662.180181] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1759] was denied
[ 4662.181709] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1759] was denied

Now task "modules_autoload_mode" to 2, disabled mode.
$ lsmod | grep dccp
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ su - root
 # ./pr_modules_autoload_mode_test 2
 # grep Modules /proc/self/status
ModulesAutoloadMode:    2
 # strace ./dccp_trigger

...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
...
[ 5154.218740] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1873] was denied
[ 5154.219828] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1873] was denied
[ 5154.221814] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1873] was denied
[ 5154.222731] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1873] was denied

As showed, this blocks automatic module loading per-task. This allows to
provide a usable system, where only some sandboxed apps or containers will be
restricted to trigger automatic module loading, other parts of the
system can continue to use the feature as it is which is the case of the
desktop and userfriendly machines.

[1] http://www.openwall.com/lists/oss-security/2017/02/22/3
[2] https://github.com/xairy/kernel-exploits/tree/master/CVE-2017-6074
[3] http://www.openwall.com/lists/oss-security/2017/03/29/2
[4] http://www.openwall.com/lists/oss-security/2017/03/07/6
[5] https://a13xp0p0v.github.io/2017/03/24/CVE-2017-2636.html

Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
---
 Documentation/filesystems/proc.txt                 |   3 +
 Documentation/userspace-api/index.rst              |   1 +
 .../userspace-api/modules_autoload_mode.rst        | 116 +++++++++++++++++++++
 fs/proc/array.c                                    |   6 ++
 include/linux/init_task.h                          |   8 ++
 include/linux/module.h                             |  20 ++++
 include/linux/sched.h                              |   5 +
 include/uapi/linux/prctl.h                         |   8 ++
 kernel/module.c                                    |  83 ++++++++++++---
 security/commoncap.c                               |  36 +++++++
 10 files changed, 270 insertions(+), 16 deletions(-)
 create mode 100644 Documentation/userspace-api/modules_autoload_mode.rst

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 2a84bb3..1974cb6 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -195,6 +195,7 @@ read the file /proc/PID/status:
   CapBnd: ffffffffffffffff
   NoNewPrivs:     0
   Seccomp:        0
+  ModulesAutoloadMode:    0
   voluntary_ctxt_switches:        0
   nonvoluntary_ctxt_switches:     1
 
@@ -269,6 +270,8 @@ Table 1-2: Contents of the status files (as of 4.8)
  CapBnd                      bitmap of capabilities bounding set
  NoNewPrivs                  no_new_privs, like prctl(PR_GET_NO_NEW_PRIV, ...)
  Seccomp                     seccomp mode, like prctl(PR_GET_SECCOMP, ...)
+ ModulesAutoloadMode         modules auto-load mode, like
+                             prctl(PR_GET_MODULES_AUTOLOAD_MODE, ...)
  Cpus_allowed                mask of CPUs on which this process may run
  Cpus_allowed_list           Same as previous, but in "list format"
  Mems_allowed                mask of memory nodes allowed to this process
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index 7b2eb1b..bfd51b7 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -17,6 +17,7 @@ place where this information is gathered.
    :maxdepth: 2
 
    no_new_privs
+   modules_autoload_mode
    seccomp_filter
    unshare
 
diff --git a/Documentation/userspace-api/modules_autoload_mode.rst b/Documentation/userspace-api/modules_autoload_mode.rst
new file mode 100644
index 0000000..1153c35
--- /dev/null
+++ b/Documentation/userspace-api/modules_autoload_mode.rst
@@ -0,0 +1,116 @@
+======================================
+Per-task module auto-load restrictions
+======================================
+
+
+Introduction
+============
+
+Usually a request to a kernel feature that is implemented by a module
+that is not loaded may trigger automatic module loading feature, allowing
+to transparently satisfy userspace, and provide numerous other features
+as they are needed. In this case an implicit kernel module load
+operation happens.
+
+In most cases to load or unload a kernel module, an explicit operation
+happens where programs are required to have ``CAP_SYS_MODULE`` capability
+to perform so. However, with implicit module loading, no capabilities are
+required, or only ``CAP_NET_ADMIN`` in rare cases where the module has the
+'netdev-%s' alias. Historically this was always the case as automatic
+module loading is one of the most important and transparent operations
+of Linux, users expect that their programs just work, yet, recent cases
+showed that this can be abused by unprivileged users or attackers to load
+modules that were not updated, or modules that contain bugs and
+vulnerabilities.
+
+Currently most of Linux code is in a form of modules, hence, allowing to
+control automatic module loading in some cases is as important as the
+operation itself, especially in the context where Linux is used in
+different appliances.
+
+Restricting automatic module loading allows administratros to have the
+appropriate time to update or deny module autoloading in advance. In a
+container or sandbox world where apps can be moved from one context to
+another, the ability to restrict some containers or apps to load extra
+kernel modules will prevent exposing some kernel interfaces that may not
+receive the same care as some other parts of the core. The DCCP vulnerability
+CVE-2017-6074 that can be triggered by unprivileged, or CVE-2017-7184
+in the XFRM framework are some real examples where users or programs are
+able to expose such kernel interfaces and escape their sandbox.
+
+The per-task ``modules_autoload_mode`` allow to restrict automatic module
+loading per task, preventing the kernel from exposing more of its
+interface. This is particularly useful for containers and sandboxes as
+noted above, they are restricted from affecting the rest of the system
+without affecting its functionality, automatic module loading is still
+available for others.
+
+
+Usage
+=====
+
+When the kernel is compiled with modules support ``CONFIG_MODULES``, then:
+
+``PR_SET_MODULES_AUTOLOAD_MODE``:
+        Set the current task ``modules_autoload_mode``. When a module
+        auto-load request is triggered by current task, then the
+        operation has first to satisfy the per-task access mode before
+        attempting to implicitly load the module. As an example,
+        automatic loading of modules that contain bugs or vulnerabilities
+        can be restricted and unprivileged users can no longer abuse such
+        interfaces. Once set, this setting is inherited across ``fork(2)``,
+        ``clone(2)`` and ``execve(2)``.
+
+        Prior to use, the task must call ``prctl(PR_SET_NO_NEW_PRIVS, 1)``
+        or run with ``CAP_SYS_ADMIN`` privileges in its namespace.  If
+        these are not true, ``-EACCES`` will be returned.  This requirement
+        ensures that unprivileged programs cannot affect the behaviour or
+        surprise privileged children.
+
+        Usage:
+                ``prctl(PR_SET_MODULES_AUTOLOAD_MODE, mode, 0, 0, 0);``
+
+        The 'mode' argument supports the following values:
+        0       There are no restrictions, usually the default unless set
+                by parent.
+        1       The task must have ``CAP_SYS_MODULE`` to be able to trigger a
+                module auto-load operation, or ``CAP_NET_ADMIN`` for modules
+                with a 'netdev-%s' alias.
+        2       Automatic modules loading is disabled for the current task.
+
+        The mode may only be increased, never decreased, thus ensuring
+        that once applied, processes can never relax their setting.
+
+
+        Returned values:
+        0               On success.
+        ``-EINVAL``     If 'mode' is not valid, or the operation is not
+                        supported.
+        ``-EACCES``     If task does not have ``CAP_SYS_ADMIN`` in its namespace
+                        or is not running with ``no_new_privs``.
+        ``-EPERM``      If 'mode' is less strict than current task
+                        ``modules_autoload_mode``.
+
+
+        Note that even if the per-task ``modules_autoload_mode`` allows to
+        auto-load the corresponding modules, automatic module loading
+        may still fail due to the global sysctl ``modules_autoload_mode``.
+        The default mode of ``modules_autoload_mode`` is to always allow
+        automatic module loading. For more details, please see
+        Documentation/sysctl/kernel.txt, section "modules_autoload_mode".
+
+
+        When a request to a kernel module is denied, the module name with the
+        corresponding process name and its pid are logged. Administrators can
+        use such information to explicitly load the appropriate modules.
+
+
+``PR_GET_MODULES_AUTOLOAD_MODE``:
+        Return the current task ``modules_autoload_mode``.
+
+        Usage:
+                ``prctl(PR_GET_MODULES_AUTOLOAD_MODE, 0, 0, 0, 0);``
+
+        Returned values:
+        mode            The task's ``modules_autoload_mode``
+        ``-ENOSYS``     If the kernel was compiled without ``CONFIG_MODULES``.
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 79375fc..57b6cc5 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -90,6 +90,7 @@
 #include <linux/string_helpers.h>
 #include <linux/user_namespace.h>
 #include <linux/fs_struct.h>
+#include <linux/module.h>
 
 #include <asm/pgtable.h>
 #include <asm/processor.h>
@@ -343,10 +344,15 @@ static inline void task_cap(struct seq_file *m, struct task_struct *p)
 
 static inline void task_seccomp(struct seq_file *m, struct task_struct *p)
 {
+	int autoload = task_modules_autoload_mode(p);
+
 	seq_put_decimal_ull(m, "NoNewPrivs:\t", task_no_new_privs(p));
 #ifdef CONFIG_SECCOMP
 	seq_put_decimal_ull(m, "\nSeccomp:\t", p->seccomp.mode);
 #endif
+	if (autoload != -ENOSYS)
+		seq_put_decimal_ull(m, "\nModulesAutoloadMode:\t", autoload);
+
 	seq_putc(m, '\n');
 }
 
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 6a53262..f564b41 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -153,6 +153,13 @@ extern struct cred init_cred;
 # define INIT_CGROUP_SCHED(tsk)
 #endif
 
+#ifdef CONFIG_MODULES
+# define INIT_MODULES_AUTOLOAD_MODE(tsk)				\
+	.modules_autoload_mode = 0,
+#else
+# define INIT_MODULES_AUTOLOAD_MODE(tsk)
+#endif
+
 #ifdef CONFIG_PERF_EVENTS
 # define INIT_PERF_EVENTS(tsk)						\
 	.perf_event_mutex = 						\
@@ -250,6 +257,7 @@ extern struct cred init_cred;
 	.tasks		= LIST_HEAD_INIT(tsk.tasks),			\
 	INIT_PUSHABLE_TASKS(tsk)					\
 	INIT_CGROUP_SCHED(tsk)						\
+	INIT_MODULES_AUTOLOAD_MODE(tsk)					\
 	.ptraced	= LIST_HEAD_INIT(tsk.ptraced),			\
 	.ptrace_entry	= LIST_HEAD_INIT(tsk.ptrace_entry),		\
 	.real_parent	= &tsk,						\
diff --git a/include/linux/module.h b/include/linux/module.h
index c36aed8..1d742d3 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -13,6 +13,7 @@
 #include <linux/kmod.h>
 #include <linux/init.h>
 #include <linux/elf.h>
+#include <linux/sched.h>
 #include <linux/stringify.h>
 #include <linux/kobject.h>
 #include <linux/moduleparam.h>
@@ -510,6 +511,15 @@ bool is_module_text_address(unsigned long addr);
 int may_autoload_module(char *kmod_name, int required_cap,
 			const char *kmod_prefix);
 
+/* Set 'modules_autoload_mode' of current task */
+int task_set_modules_autoload_mode(unsigned long value);
+
+/* Read task's 'modules_autoload_mode' */
+static inline int task_modules_autoload_mode(struct task_struct *task)
+{
+	return task->modules_autoload_mode;
+}
+
 static inline bool within_module_core(unsigned long addr,
 				      const struct module *mod)
 {
@@ -662,6 +672,16 @@ static inline int may_autoload_module(char *kmod_name, int required_cap,
 	return -ENOSYS;
 }
 
+static inline int task_set_modules_autoload_mode(unsigned long value)
+{
+	return -ENOSYS;
+}
+
+static inline int task_modules_autoload_mode(struct task_struct *task)
+{
+	return -ENOSYS;
+}
+
 static inline struct module *__module_address(unsigned long addr)
 {
 	return NULL;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index e5a2fbc..1b8cf78 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -658,6 +658,11 @@ struct task_struct {
 
 	struct restart_block		restart_block;
 
+#ifdef CONFIG_MODULES
+	/* per-task modules auto-load mode */
+	unsigned			modules_autoload_mode:2;
+#endif
+
 	pid_t				pid;
 	pid_t				tgid;
 
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 3165863..5baf9ae 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -211,4 +211,12 @@ struct prctl_mm_map {
 #define PR_SET_PDEATHSIG_PROC		48
 #define PR_GET_PDEATHSIG_PROC		49
 
+/*
+ * Control the per-task modules auto-load mode
+ *
+ * See Documentation/prctl/modules_autoload_mode.txt for more details.
+ */
+#define PR_SET_MODULES_AUTOLOAD_MODE	50
+#define PR_GET_MODULES_AUTOLOAD_MODE	51
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/module.c b/kernel/module.c
index a7205fb..5c24ac4b 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -4345,6 +4345,7 @@ EXPORT_SYMBOL_GPL(__module_text_address);
 /**
  * may_autoload_module - Determine whether a module auto-load operation
  * is permitted
+ *
  * @kmod_name: The module name
  * @required_cap: if positive, may allow to auto-load the module if this
  *	capability is set
@@ -4362,47 +4363,51 @@ EXPORT_SYMBOL_GPL(__module_text_address);
  * loading.
  *
  * However even if the caller has the required capability, the operation can
- * still be denied due to the global "modules_autoload_mode" sysctl mode. Unless
- * set by enduser, the operation is always allowed which is the default.
+ * still be denied due to the per-task "modules_autoload_mode" mode and the
+ * global "modules_autoload_mode" sysctl one. Unless set by enduser, the
+ * operation is always allowed which is the default.
  *
  * The permission check is performed in this order:
- * 1) If the global sysctl "modules_autoload_mode" is set to 'disabled', then
- *    operation is denied.
+ * 1) We calculate the strict mode of both:
+ *    per-task 'modules_autoload_mode' and global sysctl 'modules_autoload_mode'
+ *
+ * We follow up with the result mode as "modules_autoload_mode":
  *
- * 2) If the global sysctl "modules_autoload_mode" is set to 'privileged', then:
+ * 2) If "modules_autoload_mode" is set to 'disabled', then operation is denied.
  *
- *   2.1) If "@required_cap" is positive and "@kmod_prefix" is set, then
+ * 3) If "modules_autoload_mode" is set to 'privileged', then:
+ *
+ *   3.1) If "@required_cap" is positive and "@kmod_prefix" is set, then
  *   if the caller has the capability, the operation is allowed.
  *
- *   2.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
+ *   3.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
  *   fallback to check if caller has CAP_SYS_MODULE, if so, operation is
  *   allowed.
  *
- *   2.3) If caller passes "@required_cap" as a negative then we fallback to
+ *   3.3) If caller passes "@required_cap" as a negative then we fallback to
  *   check if caller has CAP_SYS_MODULE, if so, operation is allowed.
  *
  *   We require capabilities to autoload modules here, and CAP_SYS_MODULE here is
  *   the default.
  *
- *   2.4) Otherwise operation is denied.
+ *   3.4) Otherwise operation is denied.
  *
- * 3) If the global sysctl "modules_autoload_mode" is set to 'allowed' which is
- *    the default, then:
+ * 4) If "modules_autoload_mode" is set to 'allowed' which is the default, then:
  *
- *   3.1) If "@required_cap" is positive and "@kmod_prefix" is set, we check if
+ *   4.1) If "@required_cap" is positive and "@kmod_prefix" is set, we check if
  *   caller has the capability, if so, operation is allowed.
  *   In this case the calling subsystem requires the capability to be set before
  *   allowing modules autoload operations and we have to honor that.
  *
- *   3.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
+ *   4.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
  *   fallback to check if caller has CAP_SYS_MODULE, if so, operation is
  *   allowed.
  *
- *   3.3) If caller passes "@required_cap" as a negative then operation is
+ *   4.3) If caller passes "@required_cap" as a negative then operation is
  *   allowed. This is the most common case as it is used now by
  *   request_module() function.
  *
- *   3.4) Otherwise operation is denied.
+ *   4.4) Otherwise operation is denied.
  *
  * Returns 0 if the module request is allowed or -EPERM if not.
  */
@@ -4410,7 +4415,8 @@ int may_autoload_module(char *kmod_name, int required_cap,
 			const char *kmod_prefix)
 {
 	int module_require_cap = CAP_SYS_MODULE;
-	unsigned int autoload = modules_autoload_mode;
+	unsigned int autoload = max_t(unsigned int, modules_autoload_mode,
+				      current->modules_autoload_mode);
 
 	/* Short-cut for most use cases where kmod auto-loading is allowed */
 	if (autoload == MODULES_AUTOLOAD_ALLOWED && required_cap < 0)
@@ -4442,6 +4448,51 @@ int may_autoload_module(char *kmod_name, int required_cap,
 	return -EPERM;
 }
 
+/**
+ * task_set_modules_autoload_mode - Set per-task modules auto-load mode
+ * @value: Value to set "modules_autoload_mode" of current task
+ *
+ * Set current task "modules_autoload_mode". The task has to have
+ * CAP_SYS_ADMIN in its namespace or be running with no_new_privs. This
+ * avoids scenarios where unprivileged tasks can affect the behaviour of
+ * privilged children by restricting module or kernel features.
+ *
+ * The task's "modules_autoload_mode" may only be increased, never decreased.
+ *
+ * Returns 0 on success, -EINVAL if @value is not valid, -EACCES if task does
+ * not have CAP_SYS_ADMIN in its namespace or is not running with no_new_privs,
+ * and finally -EPERM if @value is less strict than current task
+ * "modules_autoload_mode".
+ *
+ */
+int task_set_modules_autoload_mode(unsigned long value)
+{
+	if (value > MODULES_AUTOLOAD_DISABLED)
+		return -EINVAL;
+
+	/*
+	 * To set task "modules_autoload_mode" requires that the task has
+	 * CAP_SYS_ADMIN in its namespace or be running with no_new_privs.
+	 * This avoids scenarios where unprivileged tasks can affect the
+	 * behaviour of privileged children by restricting module features.
+	 */
+	if (!task_no_new_privs(current) &&
+	    security_capable_noaudit(current_cred(), current_user_ns(),
+				     CAP_SYS_ADMIN) != 0)
+		return -EACCES;
+
+	/*
+	 * The "modules_autoload_mode" may only be increased, never decreased,
+	 * ensuring that once applied, processes can never relax their settings.
+	 */
+	if (current->modules_autoload_mode > value)
+		return -EPERM;
+	else if (current->modules_autoload_mode < value)
+		current->modules_autoload_mode = value;
+
+	return 0;
+}
+
 /* Don't grab lock, we're oopsing. */
 void print_modules(void)
 {
diff --git a/security/commoncap.c b/security/commoncap.c
index 236e573..67a235c 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -1157,6 +1157,36 @@ static int cap_prctl_drop(unsigned long cap)
 	return commit_creds(new);
 }
 
+/*
+ * Implement PR_SET_MODULES_AUTOLOAD_MODE.
+ *
+ * Returns 0 on success, -ve on error.
+ */
+static int pr_set_modules_autoload_mode(unsigned long arg2, unsigned long arg3,
+					unsigned long arg4, unsigned long arg5)
+{
+	if (arg3 || arg4 || arg5)
+		return -EINVAL;
+
+	return task_set_modules_autoload_mode(arg2);
+}
+
+/*
+ * Implement PR_GET_MODULES_AUTOLOAD_MODE.
+ *
+ * Return current task "modules_autoload_mode", -ve on error.
+ */
+static inline int pr_get_modules_autoload_mode(unsigned long arg2,
+					       unsigned long arg3,
+					       unsigned long arg4,
+					       unsigned long arg5)
+{
+	if (arg2 || arg3 || arg4 || arg5)
+		return -EINVAL;
+
+	return task_modules_autoload_mode(current);
+}
+
 /**
  * cap_task_prctl - Implement process control functions for this security module
  * @option: The process control function requested
@@ -1287,6 +1317,12 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 			return commit_creds(new);
 		}
 
+	case PR_SET_MODULES_AUTOLOAD_MODE:
+		return pr_set_modules_autoload_mode(arg2, arg3, arg4, arg5);
+
+	case PR_GET_MODULES_AUTOLOAD_MODE:
+		return pr_get_modules_autoload_mode(arg2, arg3, arg4, arg5);
+
 	default:
 		/* No functionality available - continue with default */
 		return -ENOSYS;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 266+ messages in thread

* [PATCH v5 next 4/5] modules:capabilities: add a per-task modules auto-load mode
@ 2017-11-27 17:18   ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: linux-security-module

Previous patches added the global sysctl "modules_autoload_mode". This patch
make it possible to support process trees, containers, and sandboxes by
providing an inherited per-task "modules_autoload_mode" flag that cannot be
re-enabled once disabled. This allows to improve automatic module loading
without affecting the rest of the system.

Why we need this ?

Usually a request to a kernel feature that is implemented by a module
that is not loaded may trigger automatic module loading feature,
allowing to transparently satisfy userspace, and provide numeours
features as they are needed. In this case an implicit kernel module load
operation happens.

In most cases to load or unload a kernel module, an explicit operation
happens where programs are required to have CAP_SYS_MODULE capability to
perform so. However, in general with implicit module loading, no
capabilities are required as automatic module loading is one of the most
important and transparent operations of Linux.

Recent vulnerabilities showed that automatic module loading can be
abused in order to expose more bugs. Some of these vulnerabilities are:

* DCCP use after free CVE-2017-6074 [1] [2]
  Unprivileged to local root PoC.

* XFRM framework CVE-2017-7184 [3]
  As advertised it seems it was used to break Ubuntu at a security
  contest.

* n_hldc CVE-2017-2636 [4] [5]
  Local privilege escalation.

* L2TPv3 CVE-2016-10200

Currently most of Linux code is in a form of modules, and not all
modules are written or maintained in the same way. In a container or
sandbox world, apps can be moved from one context to another or from
one Linux system to another one, the ability to restrict some of these
apps to load extra kernel modules will prevent exposing some kernel
interfaces that have not been updated withing such systems.

The DCCP vulnerability CVE-2017-6074 that can be triggered by
unprivileged, or CVE-2017-7184 in the XFRM framework are some recent
real examples. CVE-2017-7184 was used to break Ubuntu at a security
contest. Ubuntu is more of desktop distro, using a global switch to
disable automatic module loading will harm users. Actually this design
will always end up being ignored by such kind of systems that need to
offer a competitive and interactive solution for their users.

>From this and from observing how apps are being run, this patch
introduces a per-task "modules_autoload_mode" to restrict automatic
module loading. This offers the following advantages:

1) Allows to abstract in userspace as something like:
DenyNewFeatures=yes

2) Automatic module loading is still available to the rest of the
system.

2) It is easy to use in containers and sandboxes. DCCP example could
have been used to escape containers. The XFRM framework CVE-2017-7184
needs CAP_NET_ADMIN, but attackers may start to target CAP_NET_ADMIN,
a per-task flag will make it harder.

3) Suitable for desktop and more interactive Linux systems.

4) Will allow in future to implement a per user policy.
The user database format is old and not extensible, as discussed maybe
with a modern format we may achieve the following:

User=djalal
DenyNewFeatures=no

Which means that interactive user will be allowed to load extra
Linux features. Others, volatile accounts or guests can be easily
blocked from doing so.

5) CAP_NET_ADMIN is useful, it handles lot of operations, at same time it
started to look more like CAP_SYS_ADMIN which is overloaded. We need
CAP_NET_ADMIN, containers need it, but at same time maybe we do not
want programs running with it to load 'netdev-%s' modules. Having an
extra per-task flag allow to discharge CAP_NET_ADMIN and other
capabilities, it is clearly targeted to automatic module loading
operations and from a higher view to 'load new kernel features schema'.

Usage:
------

To set the per-task "modules_autoload_mode":

        prctl(PR_SET_MODULES_AUTOLOAD_MODE, mode, 0, 0, 0);

When a module auto-load request is triggered by current task, then the
operation has first to satisfy the per-task access mode before attempting
to implicitly load the module. Once set, this setting is inherited across
fork, clone and execve.

Prior to use, the task must call prctl(PR_SET_NO_NEW_PRIVS, 1) or run with
CAP_SYS_ADMIN privileges in its namespace.  If these are not true, -EACCES
will be returned.  This requirement ensures that unprivileged programs cannot
affect the behaviour or surprise privileged children.

The per-task "modules_autoload_mode" supports the following values:
0       There are no restrictions, usually the default unless set
        by parent.
1       The task must have CAP_SYS_MODULE to be able to trigger a
        module auto-load operation, or CAP_NET_ADMIN for modules with
        a 'netdev-%s' alias.
2       Automatic modules loading is disabled for the current task.

The mode may only be increased, never decreased, thus ensuring that once
applied, processes can never relax their setting. This make it easy for
developers and users to handle.

Note that even if the per-task "modules_autoload_mode" allows to auto-load
the corresponding modules, automatic module loading may still fail due to
the global sysctl "modules_autoload_mode". For more details please see
Documentation/sysctl/kernel.txt, section "modules_autoload_mode".

When a request to a kernel module is denied, the module name with the
corresponding process name and its pid are logged. Administrators can use
such information to explicitly load the appropriate modules.

Testing per-task or per container setup
---------------------------------------

The following tool can be used to test the feature:
https://gist.githubusercontent.com/tixxdz/cf567e4275714199a32c4a80de4ea63a/raw/13e52ea0ee65772871bcf10fb6c94fedd349f5c1/pr_modules_autoload_mode_test.c

Example 1)

Before patch:
$ lsmod | grep ipip -
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
$ lsmod | grep ipip -
ipip                   16384  0
tunnel4                16384  1 ipip
ip_tunnel              28672  1 ipip
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0

After patch:
Set task "modules_autoload_mode" to disabled.
$ lsmod | grep ipip -
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ su - root
 # ./pr_modules_autoload_mode_test 2
 # grep Modules /proc/self/status
ModulesAutoloadMode:    2
 # ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
add tunnel "tunl0" failed: No such device
...
[  634.954652] module: automatic module loading of netdev-tunl0 by "ip"[1560] was denied
[  634.955775] module: automatic module loading of tunl0 by "ip"[1560] was denied
...

Example 2)

Sample with XFRM tunnel mode.

Before patch:
$ lsmod | grep xfrm -
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ sudo ip xfrm state add src 10.0.2.100 dst 10.0.1.100 proto esp spi $id1 \
> reqid $id2 mode tunnel auth "hmac(sha256)" $key1 enc "cbc(aes)" $key2
$ lsmod | grep xfrm
xfrm4_mode_tunnel      16384  2

After patch:
Set task "modules_autoload_mode" to disabled.
$ lsmod | grep xfrm -
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ su - root
 # ./pr_modules_autoload_mode_test 2
 # grep Modules /proc/self/status
ModulesAutoloadMode:    2
 # ip xfrm state add src 10.0.2.100 dst 10.0.1.100 proto esp spi $id1 \
> reqid $id2 mode tunnel auth "hmac(sha256)" $key1 enc "cbc(aes)" $key2
RTNETLINK answers: Protocol not supported
...
[ 3458.139490] module: automatic module loading of xfrm-mode-2-1 by "ip"[1506] was denied
...

Example 3)

Here we use DCCP as an example since the public PoC was against it.

DCCP use after free CVE-2017-6074 (unprivileged to local root):
The code path can be triggered by unprivileged, using the trigger.c
program for DCCP use after free [2] and that was fixed by
commit 5edabca9d4cff7f "dccp: fix freeing skb too early for IPV6_RECVPKTINFO".

Before patch:
$ lsmod | grep dccp
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = 3
...
$ lsmod | grep dccp
dccp_ipv6              24576  5
dccp_ipv4              24576  5 dccp_ipv6
dccp                  102400  2 dccp_ipv6,dccp_ipv4
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0

After patch:

Set task "modules_autoload_mode" to 1, privileged mode.
$ lsmod | grep dccp
$ ./pr_set_no_new_privs
$ grep NoNewPrivs /proc/self/status
NoNewPrivs:     1
$ ./pr_modules_autoload_mode_test 1
$ grep Modules /proc/self/status
ModulesAutoloadMode:    1
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
$ lsmod | grep dccp
$ dmesg
...
[ 4662.171994] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1759] was denied
[ 4662.177284] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1759] was denied
[ 4662.180181] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1759] was denied
[ 4662.181709] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1759] was denied

Now task "modules_autoload_mode" to 2, disabled mode.
$ lsmod | grep dccp
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ su - root
 # ./pr_modules_autoload_mode_test 2
 # grep Modules /proc/self/status
ModulesAutoloadMode:    2
 # strace ./dccp_trigger

...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
...
[ 5154.218740] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1873] was denied
[ 5154.219828] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1873] was denied
[ 5154.221814] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1873] was denied
[ 5154.222731] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1873] was denied

As showed, this blocks automatic module loading per-task. This allows to
provide a usable system, where only some sandboxed apps or containers will be
restricted to trigger automatic module loading, other parts of the
system can continue to use the feature as it is which is the case of the
desktop and userfriendly machines.

[1] http://www.openwall.com/lists/oss-security/2017/02/22/3
[2] https://github.com/xairy/kernel-exploits/tree/master/CVE-2017-6074
[3] http://www.openwall.com/lists/oss-security/2017/03/29/2
[4] http://www.openwall.com/lists/oss-security/2017/03/07/6
[5] https://a13xp0p0v.github.io/2017/03/24/CVE-2017-2636.html

Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
---
 Documentation/filesystems/proc.txt                 |   3 +
 Documentation/userspace-api/index.rst              |   1 +
 .../userspace-api/modules_autoload_mode.rst        | 116 +++++++++++++++++++++
 fs/proc/array.c                                    |   6 ++
 include/linux/init_task.h                          |   8 ++
 include/linux/module.h                             |  20 ++++
 include/linux/sched.h                              |   5 +
 include/uapi/linux/prctl.h                         |   8 ++
 kernel/module.c                                    |  83 ++++++++++++---
 security/commoncap.c                               |  36 +++++++
 10 files changed, 270 insertions(+), 16 deletions(-)
 create mode 100644 Documentation/userspace-api/modules_autoload_mode.rst

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 2a84bb3..1974cb6 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -195,6 +195,7 @@ read the file /proc/PID/status:
   CapBnd: ffffffffffffffff
   NoNewPrivs:     0
   Seccomp:        0
+  ModulesAutoloadMode:    0
   voluntary_ctxt_switches:        0
   nonvoluntary_ctxt_switches:     1
 
@@ -269,6 +270,8 @@ Table 1-2: Contents of the status files (as of 4.8)
  CapBnd                      bitmap of capabilities bounding set
  NoNewPrivs                  no_new_privs, like prctl(PR_GET_NO_NEW_PRIV, ...)
  Seccomp                     seccomp mode, like prctl(PR_GET_SECCOMP, ...)
+ ModulesAutoloadMode         modules auto-load mode, like
+                             prctl(PR_GET_MODULES_AUTOLOAD_MODE, ...)
  Cpus_allowed                mask of CPUs on which this process may run
  Cpus_allowed_list           Same as previous, but in "list format"
  Mems_allowed                mask of memory nodes allowed to this process
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index 7b2eb1b..bfd51b7 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -17,6 +17,7 @@ place where this information is gathered.
    :maxdepth: 2
 
    no_new_privs
+   modules_autoload_mode
    seccomp_filter
    unshare
 
diff --git a/Documentation/userspace-api/modules_autoload_mode.rst b/Documentation/userspace-api/modules_autoload_mode.rst
new file mode 100644
index 0000000..1153c35
--- /dev/null
+++ b/Documentation/userspace-api/modules_autoload_mode.rst
@@ -0,0 +1,116 @@
+======================================
+Per-task module auto-load restrictions
+======================================
+
+
+Introduction
+============
+
+Usually a request to a kernel feature that is implemented by a module
+that is not loaded may trigger automatic module loading feature, allowing
+to transparently satisfy userspace, and provide numerous other features
+as they are needed. In this case an implicit kernel module load
+operation happens.
+
+In most cases to load or unload a kernel module, an explicit operation
+happens where programs are required to have ``CAP_SYS_MODULE`` capability
+to perform so. However, with implicit module loading, no capabilities are
+required, or only ``CAP_NET_ADMIN`` in rare cases where the module has the
+'netdev-%s' alias. Historically this was always the case as automatic
+module loading is one of the most important and transparent operations
+of Linux, users expect that their programs just work, yet, recent cases
+showed that this can be abused by unprivileged users or attackers to load
+modules that were not updated, or modules that contain bugs and
+vulnerabilities.
+
+Currently most of Linux code is in a form of modules, hence, allowing to
+control automatic module loading in some cases is as important as the
+operation itself, especially in the context where Linux is used in
+different appliances.
+
+Restricting automatic module loading allows administratros to have the
+appropriate time to update or deny module autoloading in advance. In a
+container or sandbox world where apps can be moved from one context to
+another, the ability to restrict some containers or apps to load extra
+kernel modules will prevent exposing some kernel interfaces that may not
+receive the same care as some other parts of the core. The DCCP vulnerability
+CVE-2017-6074 that can be triggered by unprivileged, or CVE-2017-7184
+in the XFRM framework are some real examples where users or programs are
+able to expose such kernel interfaces and escape their sandbox.
+
+The per-task ``modules_autoload_mode`` allow to restrict automatic module
+loading per task, preventing the kernel from exposing more of its
+interface. This is particularly useful for containers and sandboxes as
+noted above, they are restricted from affecting the rest of the system
+without affecting its functionality, automatic module loading is still
+available for others.
+
+
+Usage
+=====
+
+When the kernel is compiled with modules support ``CONFIG_MODULES``, then:
+
+``PR_SET_MODULES_AUTOLOAD_MODE``:
+        Set the current task ``modules_autoload_mode``. When a module
+        auto-load request is triggered by current task, then the
+        operation has first to satisfy the per-task access mode before
+        attempting to implicitly load the module. As an example,
+        automatic loading of modules that contain bugs or vulnerabilities
+        can be restricted and unprivileged users can no longer abuse such
+        interfaces. Once set, this setting is inherited across ``fork(2)``,
+        ``clone(2)`` and ``execve(2)``.
+
+        Prior to use, the task must call ``prctl(PR_SET_NO_NEW_PRIVS, 1)``
+        or run with ``CAP_SYS_ADMIN`` privileges in its namespace.  If
+        these are not true, ``-EACCES`` will be returned.  This requirement
+        ensures that unprivileged programs cannot affect the behaviour or
+        surprise privileged children.
+
+        Usage:
+                ``prctl(PR_SET_MODULES_AUTOLOAD_MODE, mode, 0, 0, 0);``
+
+        The 'mode' argument supports the following values:
+        0       There are no restrictions, usually the default unless set
+                by parent.
+        1       The task must have ``CAP_SYS_MODULE`` to be able to trigger a
+                module auto-load operation, or ``CAP_NET_ADMIN`` for modules
+                with a 'netdev-%s' alias.
+        2       Automatic modules loading is disabled for the current task.
+
+        The mode may only be increased, never decreased, thus ensuring
+        that once applied, processes can never relax their setting.
+
+
+        Returned values:
+        0               On success.
+        ``-EINVAL``     If 'mode' is not valid, or the operation is not
+                        supported.
+        ``-EACCES``     If task does not have ``CAP_SYS_ADMIN`` in its namespace
+                        or is not running with ``no_new_privs``.
+        ``-EPERM``      If 'mode' is less strict than current task
+                        ``modules_autoload_mode``.
+
+
+        Note that even if the per-task ``modules_autoload_mode`` allows to
+        auto-load the corresponding modules, automatic module loading
+        may still fail due to the global sysctl ``modules_autoload_mode``.
+        The default mode of ``modules_autoload_mode`` is to always allow
+        automatic module loading. For more details, please see
+        Documentation/sysctl/kernel.txt, section "modules_autoload_mode".
+
+
+        When a request to a kernel module is denied, the module name with the
+        corresponding process name and its pid are logged. Administrators can
+        use such information to explicitly load the appropriate modules.
+
+
+``PR_GET_MODULES_AUTOLOAD_MODE``:
+        Return the current task ``modules_autoload_mode``.
+
+        Usage:
+                ``prctl(PR_GET_MODULES_AUTOLOAD_MODE, 0, 0, 0, 0);``
+
+        Returned values:
+        mode            The task's ``modules_autoload_mode``
+        ``-ENOSYS``     If the kernel was compiled without ``CONFIG_MODULES``.
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 79375fc..57b6cc5 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -90,6 +90,7 @@
 #include <linux/string_helpers.h>
 #include <linux/user_namespace.h>
 #include <linux/fs_struct.h>
+#include <linux/module.h>
 
 #include <asm/pgtable.h>
 #include <asm/processor.h>
@@ -343,10 +344,15 @@ static inline void task_cap(struct seq_file *m, struct task_struct *p)
 
 static inline void task_seccomp(struct seq_file *m, struct task_struct *p)
 {
+	int autoload = task_modules_autoload_mode(p);
+
 	seq_put_decimal_ull(m, "NoNewPrivs:\t", task_no_new_privs(p));
 #ifdef CONFIG_SECCOMP
 	seq_put_decimal_ull(m, "\nSeccomp:\t", p->seccomp.mode);
 #endif
+	if (autoload != -ENOSYS)
+		seq_put_decimal_ull(m, "\nModulesAutoloadMode:\t", autoload);
+
 	seq_putc(m, '\n');
 }
 
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 6a53262..f564b41 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -153,6 +153,13 @@ extern struct cred init_cred;
 # define INIT_CGROUP_SCHED(tsk)
 #endif
 
+#ifdef CONFIG_MODULES
+# define INIT_MODULES_AUTOLOAD_MODE(tsk)				\
+	.modules_autoload_mode = 0,
+#else
+# define INIT_MODULES_AUTOLOAD_MODE(tsk)
+#endif
+
 #ifdef CONFIG_PERF_EVENTS
 # define INIT_PERF_EVENTS(tsk)						\
 	.perf_event_mutex = 						\
@@ -250,6 +257,7 @@ extern struct cred init_cred;
 	.tasks		= LIST_HEAD_INIT(tsk.tasks),			\
 	INIT_PUSHABLE_TASKS(tsk)					\
 	INIT_CGROUP_SCHED(tsk)						\
+	INIT_MODULES_AUTOLOAD_MODE(tsk)					\
 	.ptraced	= LIST_HEAD_INIT(tsk.ptraced),			\
 	.ptrace_entry	= LIST_HEAD_INIT(tsk.ptrace_entry),		\
 	.real_parent	= &tsk,						\
diff --git a/include/linux/module.h b/include/linux/module.h
index c36aed8..1d742d3 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -13,6 +13,7 @@
 #include <linux/kmod.h>
 #include <linux/init.h>
 #include <linux/elf.h>
+#include <linux/sched.h>
 #include <linux/stringify.h>
 #include <linux/kobject.h>
 #include <linux/moduleparam.h>
@@ -510,6 +511,15 @@ bool is_module_text_address(unsigned long addr);
 int may_autoload_module(char *kmod_name, int required_cap,
 			const char *kmod_prefix);
 
+/* Set 'modules_autoload_mode' of current task */
+int task_set_modules_autoload_mode(unsigned long value);
+
+/* Read task's 'modules_autoload_mode' */
+static inline int task_modules_autoload_mode(struct task_struct *task)
+{
+	return task->modules_autoload_mode;
+}
+
 static inline bool within_module_core(unsigned long addr,
 				      const struct module *mod)
 {
@@ -662,6 +672,16 @@ static inline int may_autoload_module(char *kmod_name, int required_cap,
 	return -ENOSYS;
 }
 
+static inline int task_set_modules_autoload_mode(unsigned long value)
+{
+	return -ENOSYS;
+}
+
+static inline int task_modules_autoload_mode(struct task_struct *task)
+{
+	return -ENOSYS;
+}
+
 static inline struct module *__module_address(unsigned long addr)
 {
 	return NULL;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index e5a2fbc..1b8cf78 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -658,6 +658,11 @@ struct task_struct {
 
 	struct restart_block		restart_block;
 
+#ifdef CONFIG_MODULES
+	/* per-task modules auto-load mode */
+	unsigned			modules_autoload_mode:2;
+#endif
+
 	pid_t				pid;
 	pid_t				tgid;
 
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 3165863..5baf9ae 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -211,4 +211,12 @@ struct prctl_mm_map {
 #define PR_SET_PDEATHSIG_PROC		48
 #define PR_GET_PDEATHSIG_PROC		49
 
+/*
+ * Control the per-task modules auto-load mode
+ *
+ * See Documentation/prctl/modules_autoload_mode.txt for more details.
+ */
+#define PR_SET_MODULES_AUTOLOAD_MODE	50
+#define PR_GET_MODULES_AUTOLOAD_MODE	51
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/module.c b/kernel/module.c
index a7205fb..5c24ac4b 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -4345,6 +4345,7 @@ EXPORT_SYMBOL_GPL(__module_text_address);
 /**
  * may_autoload_module - Determine whether a module auto-load operation
  * is permitted
+ *
  * @kmod_name: The module name
  * @required_cap: if positive, may allow to auto-load the module if this
  *	capability is set
@@ -4362,47 +4363,51 @@ EXPORT_SYMBOL_GPL(__module_text_address);
  * loading.
  *
  * However even if the caller has the required capability, the operation can
- * still be denied due to the global "modules_autoload_mode" sysctl mode. Unless
- * set by enduser, the operation is always allowed which is the default.
+ * still be denied due to the per-task "modules_autoload_mode" mode and the
+ * global "modules_autoload_mode" sysctl one. Unless set by enduser, the
+ * operation is always allowed which is the default.
  *
  * The permission check is performed in this order:
- * 1) If the global sysctl "modules_autoload_mode" is set to 'disabled', then
- *    operation is denied.
+ * 1) We calculate the strict mode of both:
+ *    per-task 'modules_autoload_mode' and global sysctl 'modules_autoload_mode'
+ *
+ * We follow up with the result mode as "modules_autoload_mode":
  *
- * 2) If the global sysctl "modules_autoload_mode" is set to 'privileged', then:
+ * 2) If "modules_autoload_mode" is set to 'disabled', then operation is denied.
  *
- *   2.1) If "@required_cap" is positive and "@kmod_prefix" is set, then
+ * 3) If "modules_autoload_mode" is set to 'privileged', then:
+ *
+ *   3.1) If "@required_cap" is positive and "@kmod_prefix" is set, then
  *   if the caller has the capability, the operation is allowed.
  *
- *   2.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
+ *   3.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
  *   fallback to check if caller has CAP_SYS_MODULE, if so, operation is
  *   allowed.
  *
- *   2.3) If caller passes "@required_cap" as a negative then we fallback to
+ *   3.3) If caller passes "@required_cap" as a negative then we fallback to
  *   check if caller has CAP_SYS_MODULE, if so, operation is allowed.
  *
  *   We require capabilities to autoload modules here, and CAP_SYS_MODULE here is
  *   the default.
  *
- *   2.4) Otherwise operation is denied.
+ *   3.4) Otherwise operation is denied.
  *
- * 3) If the global sysctl "modules_autoload_mode" is set to 'allowed' which is
- *    the default, then:
+ * 4) If "modules_autoload_mode" is set to 'allowed' which is the default, then:
  *
- *   3.1) If "@required_cap" is positive and "@kmod_prefix" is set, we check if
+ *   4.1) If "@required_cap" is positive and "@kmod_prefix" is set, we check if
  *   caller has the capability, if so, operation is allowed.
  *   In this case the calling subsystem requires the capability to be set before
  *   allowing modules autoload operations and we have to honor that.
  *
- *   3.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
+ *   4.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
  *   fallback to check if caller has CAP_SYS_MODULE, if so, operation is
  *   allowed.
  *
- *   3.3) If caller passes "@required_cap" as a negative then operation is
+ *   4.3) If caller passes "@required_cap" as a negative then operation is
  *   allowed. This is the most common case as it is used now by
  *   request_module() function.
  *
- *   3.4) Otherwise operation is denied.
+ *   4.4) Otherwise operation is denied.
  *
  * Returns 0 if the module request is allowed or -EPERM if not.
  */
@@ -4410,7 +4415,8 @@ int may_autoload_module(char *kmod_name, int required_cap,
 			const char *kmod_prefix)
 {
 	int module_require_cap = CAP_SYS_MODULE;
-	unsigned int autoload = modules_autoload_mode;
+	unsigned int autoload = max_t(unsigned int, modules_autoload_mode,
+				      current->modules_autoload_mode);
 
 	/* Short-cut for most use cases where kmod auto-loading is allowed */
 	if (autoload == MODULES_AUTOLOAD_ALLOWED && required_cap < 0)
@@ -4442,6 +4448,51 @@ int may_autoload_module(char *kmod_name, int required_cap,
 	return -EPERM;
 }
 
+/**
+ * task_set_modules_autoload_mode - Set per-task modules auto-load mode
+ * @value: Value to set "modules_autoload_mode" of current task
+ *
+ * Set current task "modules_autoload_mode". The task has to have
+ * CAP_SYS_ADMIN in its namespace or be running with no_new_privs. This
+ * avoids scenarios where unprivileged tasks can affect the behaviour of
+ * privilged children by restricting module or kernel features.
+ *
+ * The task's "modules_autoload_mode" may only be increased, never decreased.
+ *
+ * Returns 0 on success, -EINVAL if @value is not valid, -EACCES if task does
+ * not have CAP_SYS_ADMIN in its namespace or is not running with no_new_privs,
+ * and finally -EPERM if @value is less strict than current task
+ * "modules_autoload_mode".
+ *
+ */
+int task_set_modules_autoload_mode(unsigned long value)
+{
+	if (value > MODULES_AUTOLOAD_DISABLED)
+		return -EINVAL;
+
+	/*
+	 * To set task "modules_autoload_mode" requires that the task has
+	 * CAP_SYS_ADMIN in its namespace or be running with no_new_privs.
+	 * This avoids scenarios where unprivileged tasks can affect the
+	 * behaviour of privileged children by restricting module features.
+	 */
+	if (!task_no_new_privs(current) &&
+	    security_capable_noaudit(current_cred(), current_user_ns(),
+				     CAP_SYS_ADMIN) != 0)
+		return -EACCES;
+
+	/*
+	 * The "modules_autoload_mode" may only be increased, never decreased,
+	 * ensuring that once applied, processes can never relax their settings.
+	 */
+	if (current->modules_autoload_mode > value)
+		return -EPERM;
+	else if (current->modules_autoload_mode < value)
+		current->modules_autoload_mode = value;
+
+	return 0;
+}
+
 /* Don't grab lock, we're oopsing. */
 void print_modules(void)
 {
diff --git a/security/commoncap.c b/security/commoncap.c
index 236e573..67a235c 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -1157,6 +1157,36 @@ static int cap_prctl_drop(unsigned long cap)
 	return commit_creds(new);
 }
 
+/*
+ * Implement PR_SET_MODULES_AUTOLOAD_MODE.
+ *
+ * Returns 0 on success, -ve on error.
+ */
+static int pr_set_modules_autoload_mode(unsigned long arg2, unsigned long arg3,
+					unsigned long arg4, unsigned long arg5)
+{
+	if (arg3 || arg4 || arg5)
+		return -EINVAL;
+
+	return task_set_modules_autoload_mode(arg2);
+}
+
+/*
+ * Implement PR_GET_MODULES_AUTOLOAD_MODE.
+ *
+ * Return current task "modules_autoload_mode", -ve on error.
+ */
+static inline int pr_get_modules_autoload_mode(unsigned long arg2,
+					       unsigned long arg3,
+					       unsigned long arg4,
+					       unsigned long arg5)
+{
+	if (arg2 || arg3 || arg4 || arg5)
+		return -EINVAL;
+
+	return task_modules_autoload_mode(current);
+}
+
 /**
  * cap_task_prctl - Implement process control functions for this security module
  * @option: The process control function requested
@@ -1287,6 +1317,12 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 			return commit_creds(new);
 		}
 
+	case PR_SET_MODULES_AUTOLOAD_MODE:
+		return pr_set_modules_autoload_mode(arg2, arg3, arg4, arg5);
+
+	case PR_GET_MODULES_AUTOLOAD_MODE:
+		return pr_get_modules_autoload_mode(arg2, arg3, arg4, arg5);
+
 	default:
 		/* No functionality available - continue with default */
 		return -ENOSYS;
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 266+ messages in thread

* [kernel-hardening] [PATCH v5 next 4/5] modules:capabilities: add a per-task modules auto-load mode
@ 2017-11-27 17:18   ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening
  Cc: Jonathan Corbet, Ingo Molnar, David S. Miller, netdev,
	Peter Zijlstra, Linus Torvalds, Djalal Harouni

Previous patches added the global sysctl "modules_autoload_mode". This patch
make it possible to support process trees, containers, and sandboxes by
providing an inherited per-task "modules_autoload_mode" flag that cannot be
re-enabled once disabled. This allows to improve automatic module loading
without affecting the rest of the system.

Why we need this ?

Usually a request to a kernel feature that is implemented by a module
that is not loaded may trigger automatic module loading feature,
allowing to transparently satisfy userspace, and provide numeours
features as they are needed. In this case an implicit kernel module load
operation happens.

In most cases to load or unload a kernel module, an explicit operation
happens where programs are required to have CAP_SYS_MODULE capability to
perform so. However, in general with implicit module loading, no
capabilities are required as automatic module loading is one of the most
important and transparent operations of Linux.

Recent vulnerabilities showed that automatic module loading can be
abused in order to expose more bugs. Some of these vulnerabilities are:

* DCCP use after free CVE-2017-6074 [1] [2]
  Unprivileged to local root PoC.

* XFRM framework CVE-2017-7184 [3]
  As advertised it seems it was used to break Ubuntu at a security
  contest.

* n_hldc CVE-2017-2636 [4] [5]
  Local privilege escalation.

* L2TPv3 CVE-2016-10200

Currently most of Linux code is in a form of modules, and not all
modules are written or maintained in the same way. In a container or
sandbox world, apps can be moved from one context to another or from
one Linux system to another one, the ability to restrict some of these
apps to load extra kernel modules will prevent exposing some kernel
interfaces that have not been updated withing such systems.

The DCCP vulnerability CVE-2017-6074 that can be triggered by
unprivileged, or CVE-2017-7184 in the XFRM framework are some recent
real examples. CVE-2017-7184 was used to break Ubuntu at a security
contest. Ubuntu is more of desktop distro, using a global switch to
disable automatic module loading will harm users. Actually this design
will always end up being ignored by such kind of systems that need to
offer a competitive and interactive solution for their users.

>From this and from observing how apps are being run, this patch
introduces a per-task "modules_autoload_mode" to restrict automatic
module loading. This offers the following advantages:

1) Allows to abstract in userspace as something like:
DenyNewFeatures=yes

2) Automatic module loading is still available to the rest of the
system.

2) It is easy to use in containers and sandboxes. DCCP example could
have been used to escape containers. The XFRM framework CVE-2017-7184
needs CAP_NET_ADMIN, but attackers may start to target CAP_NET_ADMIN,
a per-task flag will make it harder.

3) Suitable for desktop and more interactive Linux systems.

4) Will allow in future to implement a per user policy.
The user database format is old and not extensible, as discussed maybe
with a modern format we may achieve the following:

User=djalal
DenyNewFeatures=no

Which means that interactive user will be allowed to load extra
Linux features. Others, volatile accounts or guests can be easily
blocked from doing so.

5) CAP_NET_ADMIN is useful, it handles lot of operations, at same time it
started to look more like CAP_SYS_ADMIN which is overloaded. We need
CAP_NET_ADMIN, containers need it, but at same time maybe we do not
want programs running with it to load 'netdev-%s' modules. Having an
extra per-task flag allow to discharge CAP_NET_ADMIN and other
capabilities, it is clearly targeted to automatic module loading
operations and from a higher view to 'load new kernel features schema'.

Usage:
------

To set the per-task "modules_autoload_mode":

        prctl(PR_SET_MODULES_AUTOLOAD_MODE, mode, 0, 0, 0);

When a module auto-load request is triggered by current task, then the
operation has first to satisfy the per-task access mode before attempting
to implicitly load the module. Once set, this setting is inherited across
fork, clone and execve.

Prior to use, the task must call prctl(PR_SET_NO_NEW_PRIVS, 1) or run with
CAP_SYS_ADMIN privileges in its namespace.  If these are not true, -EACCES
will be returned.  This requirement ensures that unprivileged programs cannot
affect the behaviour or surprise privileged children.

The per-task "modules_autoload_mode" supports the following values:
0       There are no restrictions, usually the default unless set
        by parent.
1       The task must have CAP_SYS_MODULE to be able to trigger a
        module auto-load operation, or CAP_NET_ADMIN for modules with
        a 'netdev-%s' alias.
2       Automatic modules loading is disabled for the current task.

The mode may only be increased, never decreased, thus ensuring that once
applied, processes can never relax their setting. This make it easy for
developers and users to handle.

Note that even if the per-task "modules_autoload_mode" allows to auto-load
the corresponding modules, automatic module loading may still fail due to
the global sysctl "modules_autoload_mode". For more details please see
Documentation/sysctl/kernel.txt, section "modules_autoload_mode".

When a request to a kernel module is denied, the module name with the
corresponding process name and its pid are logged. Administrators can use
such information to explicitly load the appropriate modules.

Testing per-task or per container setup
---------------------------------------

The following tool can be used to test the feature:
https://gist.githubusercontent.com/tixxdz/cf567e4275714199a32c4a80de4ea63a/raw/13e52ea0ee65772871bcf10fb6c94fedd349f5c1/pr_modules_autoload_mode_test.c

Example 1)

Before patch:
$ lsmod | grep ipip -
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
$ lsmod | grep ipip -
ipip                   16384  0
tunnel4                16384  1 ipip
ip_tunnel              28672  1 ipip
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0

After patch:
Set task "modules_autoload_mode" to disabled.
$ lsmod | grep ipip -
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ su - root
 # ./pr_modules_autoload_mode_test 2
 # grep Modules /proc/self/status
ModulesAutoloadMode:    2
 # ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
add tunnel "tunl0" failed: No such device
...
[  634.954652] module: automatic module loading of netdev-tunl0 by "ip"[1560] was denied
[  634.955775] module: automatic module loading of tunl0 by "ip"[1560] was denied
...

Example 2)

Sample with XFRM tunnel mode.

Before patch:
$ lsmod | grep xfrm -
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ sudo ip xfrm state add src 10.0.2.100 dst 10.0.1.100 proto esp spi $id1 \
> reqid $id2 mode tunnel auth "hmac(sha256)" $key1 enc "cbc(aes)" $key2
$ lsmod | grep xfrm
xfrm4_mode_tunnel      16384  2

After patch:
Set task "modules_autoload_mode" to disabled.
$ lsmod | grep xfrm -
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ su - root
 # ./pr_modules_autoload_mode_test 2
 # grep Modules /proc/self/status
ModulesAutoloadMode:    2
 # ip xfrm state add src 10.0.2.100 dst 10.0.1.100 proto esp spi $id1 \
> reqid $id2 mode tunnel auth "hmac(sha256)" $key1 enc "cbc(aes)" $key2
RTNETLINK answers: Protocol not supported
...
[ 3458.139490] module: automatic module loading of xfrm-mode-2-1 by "ip"[1506] was denied
...

Example 3)

Here we use DCCP as an example since the public PoC was against it.

DCCP use after free CVE-2017-6074 (unprivileged to local root):
The code path can be triggered by unprivileged, using the trigger.c
program for DCCP use after free [2] and that was fixed by
commit 5edabca9d4cff7f "dccp: fix freeing skb too early for IPV6_RECVPKTINFO".

Before patch:
$ lsmod | grep dccp
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = 3
...
$ lsmod | grep dccp
dccp_ipv6              24576  5
dccp_ipv4              24576  5 dccp_ipv6
dccp                  102400  2 dccp_ipv6,dccp_ipv4
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0

After patch:

Set task "modules_autoload_mode" to 1, privileged mode.
$ lsmod | grep dccp
$ ./pr_set_no_new_privs
$ grep NoNewPrivs /proc/self/status
NoNewPrivs:     1
$ ./pr_modules_autoload_mode_test 1
$ grep Modules /proc/self/status
ModulesAutoloadMode:    1
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
$ lsmod | grep dccp
$ dmesg
...
[ 4662.171994] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1759] was denied
[ 4662.177284] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1759] was denied
[ 4662.180181] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1759] was denied
[ 4662.181709] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1759] was denied

Now task "modules_autoload_mode" to 2, disabled mode.
$ lsmod | grep dccp
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ su - root
 # ./pr_modules_autoload_mode_test 2
 # grep Modules /proc/self/status
ModulesAutoloadMode:    2
 # strace ./dccp_trigger

...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
...
[ 5154.218740] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1873] was denied
[ 5154.219828] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1873] was denied
[ 5154.221814] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1873] was denied
[ 5154.222731] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1873] was denied

As showed, this blocks automatic module loading per-task. This allows to
provide a usable system, where only some sandboxed apps or containers will be
restricted to trigger automatic module loading, other parts of the
system can continue to use the feature as it is which is the case of the
desktop and userfriendly machines.

[1] http://www.openwall.com/lists/oss-security/2017/02/22/3
[2] https://github.com/xairy/kernel-exploits/tree/master/CVE-2017-6074
[3] http://www.openwall.com/lists/oss-security/2017/03/29/2
[4] http://www.openwall.com/lists/oss-security/2017/03/07/6
[5] https://a13xp0p0v.github.io/2017/03/24/CVE-2017-2636.html

Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
---
 Documentation/filesystems/proc.txt                 |   3 +
 Documentation/userspace-api/index.rst              |   1 +
 .../userspace-api/modules_autoload_mode.rst        | 116 +++++++++++++++++++++
 fs/proc/array.c                                    |   6 ++
 include/linux/init_task.h                          |   8 ++
 include/linux/module.h                             |  20 ++++
 include/linux/sched.h                              |   5 +
 include/uapi/linux/prctl.h                         |   8 ++
 kernel/module.c                                    |  83 ++++++++++++---
 security/commoncap.c                               |  36 +++++++
 10 files changed, 270 insertions(+), 16 deletions(-)
 create mode 100644 Documentation/userspace-api/modules_autoload_mode.rst

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 2a84bb3..1974cb6 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -195,6 +195,7 @@ read the file /proc/PID/status:
   CapBnd: ffffffffffffffff
   NoNewPrivs:     0
   Seccomp:        0
+  ModulesAutoloadMode:    0
   voluntary_ctxt_switches:        0
   nonvoluntary_ctxt_switches:     1
 
@@ -269,6 +270,8 @@ Table 1-2: Contents of the status files (as of 4.8)
  CapBnd                      bitmap of capabilities bounding set
  NoNewPrivs                  no_new_privs, like prctl(PR_GET_NO_NEW_PRIV, ...)
  Seccomp                     seccomp mode, like prctl(PR_GET_SECCOMP, ...)
+ ModulesAutoloadMode         modules auto-load mode, like
+                             prctl(PR_GET_MODULES_AUTOLOAD_MODE, ...)
  Cpus_allowed                mask of CPUs on which this process may run
  Cpus_allowed_list           Same as previous, but in "list format"
  Mems_allowed                mask of memory nodes allowed to this process
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index 7b2eb1b..bfd51b7 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -17,6 +17,7 @@ place where this information is gathered.
    :maxdepth: 2
 
    no_new_privs
+   modules_autoload_mode
    seccomp_filter
    unshare
 
diff --git a/Documentation/userspace-api/modules_autoload_mode.rst b/Documentation/userspace-api/modules_autoload_mode.rst
new file mode 100644
index 0000000..1153c35
--- /dev/null
+++ b/Documentation/userspace-api/modules_autoload_mode.rst
@@ -0,0 +1,116 @@
+======================================
+Per-task module auto-load restrictions
+======================================
+
+
+Introduction
+============
+
+Usually a request to a kernel feature that is implemented by a module
+that is not loaded may trigger automatic module loading feature, allowing
+to transparently satisfy userspace, and provide numerous other features
+as they are needed. In this case an implicit kernel module load
+operation happens.
+
+In most cases to load or unload a kernel module, an explicit operation
+happens where programs are required to have ``CAP_SYS_MODULE`` capability
+to perform so. However, with implicit module loading, no capabilities are
+required, or only ``CAP_NET_ADMIN`` in rare cases where the module has the
+'netdev-%s' alias. Historically this was always the case as automatic
+module loading is one of the most important and transparent operations
+of Linux, users expect that their programs just work, yet, recent cases
+showed that this can be abused by unprivileged users or attackers to load
+modules that were not updated, or modules that contain bugs and
+vulnerabilities.
+
+Currently most of Linux code is in a form of modules, hence, allowing to
+control automatic module loading in some cases is as important as the
+operation itself, especially in the context where Linux is used in
+different appliances.
+
+Restricting automatic module loading allows administratros to have the
+appropriate time to update or deny module autoloading in advance. In a
+container or sandbox world where apps can be moved from one context to
+another, the ability to restrict some containers or apps to load extra
+kernel modules will prevent exposing some kernel interfaces that may not
+receive the same care as some other parts of the core. The DCCP vulnerability
+CVE-2017-6074 that can be triggered by unprivileged, or CVE-2017-7184
+in the XFRM framework are some real examples where users or programs are
+able to expose such kernel interfaces and escape their sandbox.
+
+The per-task ``modules_autoload_mode`` allow to restrict automatic module
+loading per task, preventing the kernel from exposing more of its
+interface. This is particularly useful for containers and sandboxes as
+noted above, they are restricted from affecting the rest of the system
+without affecting its functionality, automatic module loading is still
+available for others.
+
+
+Usage
+=====
+
+When the kernel is compiled with modules support ``CONFIG_MODULES``, then:
+
+``PR_SET_MODULES_AUTOLOAD_MODE``:
+        Set the current task ``modules_autoload_mode``. When a module
+        auto-load request is triggered by current task, then the
+        operation has first to satisfy the per-task access mode before
+        attempting to implicitly load the module. As an example,
+        automatic loading of modules that contain bugs or vulnerabilities
+        can be restricted and unprivileged users can no longer abuse such
+        interfaces. Once set, this setting is inherited across ``fork(2)``,
+        ``clone(2)`` and ``execve(2)``.
+
+        Prior to use, the task must call ``prctl(PR_SET_NO_NEW_PRIVS, 1)``
+        or run with ``CAP_SYS_ADMIN`` privileges in its namespace.  If
+        these are not true, ``-EACCES`` will be returned.  This requirement
+        ensures that unprivileged programs cannot affect the behaviour or
+        surprise privileged children.
+
+        Usage:
+                ``prctl(PR_SET_MODULES_AUTOLOAD_MODE, mode, 0, 0, 0);``
+
+        The 'mode' argument supports the following values:
+        0       There are no restrictions, usually the default unless set
+                by parent.
+        1       The task must have ``CAP_SYS_MODULE`` to be able to trigger a
+                module auto-load operation, or ``CAP_NET_ADMIN`` for modules
+                with a 'netdev-%s' alias.
+        2       Automatic modules loading is disabled for the current task.
+
+        The mode may only be increased, never decreased, thus ensuring
+        that once applied, processes can never relax their setting.
+
+
+        Returned values:
+        0               On success.
+        ``-EINVAL``     If 'mode' is not valid, or the operation is not
+                        supported.
+        ``-EACCES``     If task does not have ``CAP_SYS_ADMIN`` in its namespace
+                        or is not running with ``no_new_privs``.
+        ``-EPERM``      If 'mode' is less strict than current task
+                        ``modules_autoload_mode``.
+
+
+        Note that even if the per-task ``modules_autoload_mode`` allows to
+        auto-load the corresponding modules, automatic module loading
+        may still fail due to the global sysctl ``modules_autoload_mode``.
+        The default mode of ``modules_autoload_mode`` is to always allow
+        automatic module loading. For more details, please see
+        Documentation/sysctl/kernel.txt, section "modules_autoload_mode".
+
+
+        When a request to a kernel module is denied, the module name with the
+        corresponding process name and its pid are logged. Administrators can
+        use such information to explicitly load the appropriate modules.
+
+
+``PR_GET_MODULES_AUTOLOAD_MODE``:
+        Return the current task ``modules_autoload_mode``.
+
+        Usage:
+                ``prctl(PR_GET_MODULES_AUTOLOAD_MODE, 0, 0, 0, 0);``
+
+        Returned values:
+        mode            The task's ``modules_autoload_mode``
+        ``-ENOSYS``     If the kernel was compiled without ``CONFIG_MODULES``.
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 79375fc..57b6cc5 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -90,6 +90,7 @@
 #include <linux/string_helpers.h>
 #include <linux/user_namespace.h>
 #include <linux/fs_struct.h>
+#include <linux/module.h>
 
 #include <asm/pgtable.h>
 #include <asm/processor.h>
@@ -343,10 +344,15 @@ static inline void task_cap(struct seq_file *m, struct task_struct *p)
 
 static inline void task_seccomp(struct seq_file *m, struct task_struct *p)
 {
+	int autoload = task_modules_autoload_mode(p);
+
 	seq_put_decimal_ull(m, "NoNewPrivs:\t", task_no_new_privs(p));
 #ifdef CONFIG_SECCOMP
 	seq_put_decimal_ull(m, "\nSeccomp:\t", p->seccomp.mode);
 #endif
+	if (autoload != -ENOSYS)
+		seq_put_decimal_ull(m, "\nModulesAutoloadMode:\t", autoload);
+
 	seq_putc(m, '\n');
 }
 
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 6a53262..f564b41 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -153,6 +153,13 @@ extern struct cred init_cred;
 # define INIT_CGROUP_SCHED(tsk)
 #endif
 
+#ifdef CONFIG_MODULES
+# define INIT_MODULES_AUTOLOAD_MODE(tsk)				\
+	.modules_autoload_mode = 0,
+#else
+# define INIT_MODULES_AUTOLOAD_MODE(tsk)
+#endif
+
 #ifdef CONFIG_PERF_EVENTS
 # define INIT_PERF_EVENTS(tsk)						\
 	.perf_event_mutex = 						\
@@ -250,6 +257,7 @@ extern struct cred init_cred;
 	.tasks		= LIST_HEAD_INIT(tsk.tasks),			\
 	INIT_PUSHABLE_TASKS(tsk)					\
 	INIT_CGROUP_SCHED(tsk)						\
+	INIT_MODULES_AUTOLOAD_MODE(tsk)					\
 	.ptraced	= LIST_HEAD_INIT(tsk.ptraced),			\
 	.ptrace_entry	= LIST_HEAD_INIT(tsk.ptrace_entry),		\
 	.real_parent	= &tsk,						\
diff --git a/include/linux/module.h b/include/linux/module.h
index c36aed8..1d742d3 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -13,6 +13,7 @@
 #include <linux/kmod.h>
 #include <linux/init.h>
 #include <linux/elf.h>
+#include <linux/sched.h>
 #include <linux/stringify.h>
 #include <linux/kobject.h>
 #include <linux/moduleparam.h>
@@ -510,6 +511,15 @@ bool is_module_text_address(unsigned long addr);
 int may_autoload_module(char *kmod_name, int required_cap,
 			const char *kmod_prefix);
 
+/* Set 'modules_autoload_mode' of current task */
+int task_set_modules_autoload_mode(unsigned long value);
+
+/* Read task's 'modules_autoload_mode' */
+static inline int task_modules_autoload_mode(struct task_struct *task)
+{
+	return task->modules_autoload_mode;
+}
+
 static inline bool within_module_core(unsigned long addr,
 				      const struct module *mod)
 {
@@ -662,6 +672,16 @@ static inline int may_autoload_module(char *kmod_name, int required_cap,
 	return -ENOSYS;
 }
 
+static inline int task_set_modules_autoload_mode(unsigned long value)
+{
+	return -ENOSYS;
+}
+
+static inline int task_modules_autoload_mode(struct task_struct *task)
+{
+	return -ENOSYS;
+}
+
 static inline struct module *__module_address(unsigned long addr)
 {
 	return NULL;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index e5a2fbc..1b8cf78 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -658,6 +658,11 @@ struct task_struct {
 
 	struct restart_block		restart_block;
 
+#ifdef CONFIG_MODULES
+	/* per-task modules auto-load mode */
+	unsigned			modules_autoload_mode:2;
+#endif
+
 	pid_t				pid;
 	pid_t				tgid;
 
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 3165863..5baf9ae 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -211,4 +211,12 @@ struct prctl_mm_map {
 #define PR_SET_PDEATHSIG_PROC		48
 #define PR_GET_PDEATHSIG_PROC		49
 
+/*
+ * Control the per-task modules auto-load mode
+ *
+ * See Documentation/prctl/modules_autoload_mode.txt for more details.
+ */
+#define PR_SET_MODULES_AUTOLOAD_MODE	50
+#define PR_GET_MODULES_AUTOLOAD_MODE	51
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/module.c b/kernel/module.c
index a7205fb..5c24ac4b 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -4345,6 +4345,7 @@ EXPORT_SYMBOL_GPL(__module_text_address);
 /**
  * may_autoload_module - Determine whether a module auto-load operation
  * is permitted
+ *
  * @kmod_name: The module name
  * @required_cap: if positive, may allow to auto-load the module if this
  *	capability is set
@@ -4362,47 +4363,51 @@ EXPORT_SYMBOL_GPL(__module_text_address);
  * loading.
  *
  * However even if the caller has the required capability, the operation can
- * still be denied due to the global "modules_autoload_mode" sysctl mode. Unless
- * set by enduser, the operation is always allowed which is the default.
+ * still be denied due to the per-task "modules_autoload_mode" mode and the
+ * global "modules_autoload_mode" sysctl one. Unless set by enduser, the
+ * operation is always allowed which is the default.
  *
  * The permission check is performed in this order:
- * 1) If the global sysctl "modules_autoload_mode" is set to 'disabled', then
- *    operation is denied.
+ * 1) We calculate the strict mode of both:
+ *    per-task 'modules_autoload_mode' and global sysctl 'modules_autoload_mode'
+ *
+ * We follow up with the result mode as "modules_autoload_mode":
  *
- * 2) If the global sysctl "modules_autoload_mode" is set to 'privileged', then:
+ * 2) If "modules_autoload_mode" is set to 'disabled', then operation is denied.
  *
- *   2.1) If "@required_cap" is positive and "@kmod_prefix" is set, then
+ * 3) If "modules_autoload_mode" is set to 'privileged', then:
+ *
+ *   3.1) If "@required_cap" is positive and "@kmod_prefix" is set, then
  *   if the caller has the capability, the operation is allowed.
  *
- *   2.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
+ *   3.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
  *   fallback to check if caller has CAP_SYS_MODULE, if so, operation is
  *   allowed.
  *
- *   2.3) If caller passes "@required_cap" as a negative then we fallback to
+ *   3.3) If caller passes "@required_cap" as a negative then we fallback to
  *   check if caller has CAP_SYS_MODULE, if so, operation is allowed.
  *
  *   We require capabilities to autoload modules here, and CAP_SYS_MODULE here is
  *   the default.
  *
- *   2.4) Otherwise operation is denied.
+ *   3.4) Otherwise operation is denied.
  *
- * 3) If the global sysctl "modules_autoload_mode" is set to 'allowed' which is
- *    the default, then:
+ * 4) If "modules_autoload_mode" is set to 'allowed' which is the default, then:
  *
- *   3.1) If "@required_cap" is positive and "@kmod_prefix" is set, we check if
+ *   4.1) If "@required_cap" is positive and "@kmod_prefix" is set, we check if
  *   caller has the capability, if so, operation is allowed.
  *   In this case the calling subsystem requires the capability to be set before
  *   allowing modules autoload operations and we have to honor that.
  *
- *   3.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
+ *   4.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
  *   fallback to check if caller has CAP_SYS_MODULE, if so, operation is
  *   allowed.
  *
- *   3.3) If caller passes "@required_cap" as a negative then operation is
+ *   4.3) If caller passes "@required_cap" as a negative then operation is
  *   allowed. This is the most common case as it is used now by
  *   request_module() function.
  *
- *   3.4) Otherwise operation is denied.
+ *   4.4) Otherwise operation is denied.
  *
  * Returns 0 if the module request is allowed or -EPERM if not.
  */
@@ -4410,7 +4415,8 @@ int may_autoload_module(char *kmod_name, int required_cap,
 			const char *kmod_prefix)
 {
 	int module_require_cap = CAP_SYS_MODULE;
-	unsigned int autoload = modules_autoload_mode;
+	unsigned int autoload = max_t(unsigned int, modules_autoload_mode,
+				      current->modules_autoload_mode);
 
 	/* Short-cut for most use cases where kmod auto-loading is allowed */
 	if (autoload == MODULES_AUTOLOAD_ALLOWED && required_cap < 0)
@@ -4442,6 +4448,51 @@ int may_autoload_module(char *kmod_name, int required_cap,
 	return -EPERM;
 }
 
+/**
+ * task_set_modules_autoload_mode - Set per-task modules auto-load mode
+ * @value: Value to set "modules_autoload_mode" of current task
+ *
+ * Set current task "modules_autoload_mode". The task has to have
+ * CAP_SYS_ADMIN in its namespace or be running with no_new_privs. This
+ * avoids scenarios where unprivileged tasks can affect the behaviour of
+ * privilged children by restricting module or kernel features.
+ *
+ * The task's "modules_autoload_mode" may only be increased, never decreased.
+ *
+ * Returns 0 on success, -EINVAL if @value is not valid, -EACCES if task does
+ * not have CAP_SYS_ADMIN in its namespace or is not running with no_new_privs,
+ * and finally -EPERM if @value is less strict than current task
+ * "modules_autoload_mode".
+ *
+ */
+int task_set_modules_autoload_mode(unsigned long value)
+{
+	if (value > MODULES_AUTOLOAD_DISABLED)
+		return -EINVAL;
+
+	/*
+	 * To set task "modules_autoload_mode" requires that the task has
+	 * CAP_SYS_ADMIN in its namespace or be running with no_new_privs.
+	 * This avoids scenarios where unprivileged tasks can affect the
+	 * behaviour of privileged children by restricting module features.
+	 */
+	if (!task_no_new_privs(current) &&
+	    security_capable_noaudit(current_cred(), current_user_ns(),
+				     CAP_SYS_ADMIN) != 0)
+		return -EACCES;
+
+	/*
+	 * The "modules_autoload_mode" may only be increased, never decreased,
+	 * ensuring that once applied, processes can never relax their settings.
+	 */
+	if (current->modules_autoload_mode > value)
+		return -EPERM;
+	else if (current->modules_autoload_mode < value)
+		current->modules_autoload_mode = value;
+
+	return 0;
+}
+
 /* Don't grab lock, we're oopsing. */
 void print_modules(void)
 {
diff --git a/security/commoncap.c b/security/commoncap.c
index 236e573..67a235c 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -1157,6 +1157,36 @@ static int cap_prctl_drop(unsigned long cap)
 	return commit_creds(new);
 }
 
+/*
+ * Implement PR_SET_MODULES_AUTOLOAD_MODE.
+ *
+ * Returns 0 on success, -ve on error.
+ */
+static int pr_set_modules_autoload_mode(unsigned long arg2, unsigned long arg3,
+					unsigned long arg4, unsigned long arg5)
+{
+	if (arg3 || arg4 || arg5)
+		return -EINVAL;
+
+	return task_set_modules_autoload_mode(arg2);
+}
+
+/*
+ * Implement PR_GET_MODULES_AUTOLOAD_MODE.
+ *
+ * Return current task "modules_autoload_mode", -ve on error.
+ */
+static inline int pr_get_modules_autoload_mode(unsigned long arg2,
+					       unsigned long arg3,
+					       unsigned long arg4,
+					       unsigned long arg5)
+{
+	if (arg2 || arg3 || arg4 || arg5)
+		return -EINVAL;
+
+	return task_modules_autoload_mode(current);
+}
+
 /**
  * cap_task_prctl - Implement process control functions for this security module
  * @option: The process control function requested
@@ -1287,6 +1317,12 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 			return commit_creds(new);
 		}
 
+	case PR_SET_MODULES_AUTOLOAD_MODE:
+		return pr_set_modules_autoload_mode(arg2, arg3, arg4, arg5);
+
+	case PR_GET_MODULES_AUTOLOAD_MODE:
+		return pr_get_modules_autoload_mode(arg2, arg3, arg4, arg5);
+
 	default:
 		/* No functionality available - continue with default */
 		return -ENOSYS;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 266+ messages in thread

* [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-27 17:18 ` Djalal Harouni
  (?)
@ 2017-11-27 17:18   ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening
  Cc: Jonathan Corbet, Ingo Molnar, David S. Miller, netdev,
	Peter Zijlstra, Linus Torvalds, Djalal Harouni

This uses the new request_module_cap() facility to directly propagate
CAP_NET_ADMIN capability and the 'netdev' module prefix to the
capability subsystem as it was suggested.

We do not remove the explicit capable(CAP_NET_ADMIN) check here, but we
may remove it in future versions since it is also performed by the
capability subsystem. This allows to have a better interface where other
subsystems will just use this call and let the capability subsystem
handles the permission checks, if the modules should be loaded or not.

This is also an infrastructure fix since historically Linux always
allowed to auto-load modules without privileges, and later the net code
started to check capabilities and prefixes, adapted the CAP_NET_ADMIN
check with the 'netdev' prefix to prevent abusing the capability by
loading non-netdev modules. However from a bigger picture we want to
continue to support automatic module loading as non privileged but also
implement easy policy solutions like:

User=djalal
DenyNewFeatures=no

Which will translate to allow the interactive user djalal to load extra
Linux features. Others, volatile accounts or guests can be easily
blocked from doing so. We have introduced in previous patches the
necessary infrastructure and now with this change we start to use the
new request_module_cap() function to explicitly tell the capability
subsystem that we want to auto-load modules with CAP_NET_ADMIN if they
are prefixed.

This is also based on suggestions from Rusty Russel and Kees Cook [1]

[1] https://lkml.org/lkml/2017/4/26/735

Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Andy Lutomirski <luto@kernel.org>
Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
---
 net/core/dev_ioctl.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/core/dev_ioctl.c b/net/core/dev_ioctl.c
index 7e690d0..fdd8560 100644
--- a/net/core/dev_ioctl.c
+++ b/net/core/dev_ioctl.c
@@ -382,8 +382,10 @@ void dev_load(struct net *net, const char *name)
 	rcu_read_unlock();
 
 	no_module = !dev;
+	/* "netdev-%s" modules are allowed if CAP_NET_ADMIN is set */
 	if (no_module && capable(CAP_NET_ADMIN))
-		no_module = request_module("netdev-%s", name);
+		no_module = request_module_cap(CAP_NET_ADMIN, "netdev",
+					       "%s", name);
 	if (no_module && capable(CAP_SYS_MODULE))
 		request_module("%s", name);
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 266+ messages in thread

* [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 17:18   ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: linux-security-module

This uses the new request_module_cap() facility to directly propagate
CAP_NET_ADMIN capability and the 'netdev' module prefix to the
capability subsystem as it was suggested.

We do not remove the explicit capable(CAP_NET_ADMIN) check here, but we
may remove it in future versions since it is also performed by the
capability subsystem. This allows to have a better interface where other
subsystems will just use this call and let the capability subsystem
handles the permission checks, if the modules should be loaded or not.

This is also an infrastructure fix since historically Linux always
allowed to auto-load modules without privileges, and later the net code
started to check capabilities and prefixes, adapted the CAP_NET_ADMIN
check with the 'netdev' prefix to prevent abusing the capability by
loading non-netdev modules. However from a bigger picture we want to
continue to support automatic module loading as non privileged but also
implement easy policy solutions like:

User=djalal
DenyNewFeatures=no

Which will translate to allow the interactive user djalal to load extra
Linux features. Others, volatile accounts or guests can be easily
blocked from doing so. We have introduced in previous patches the
necessary infrastructure and now with this change we start to use the
new request_module_cap() function to explicitly tell the capability
subsystem that we want to auto-load modules with CAP_NET_ADMIN if they
are prefixed.

This is also based on suggestions from Rusty Russel and Kees Cook [1]

[1] https://lkml.org/lkml/2017/4/26/735

Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Andy Lutomirski <luto@kernel.org>
Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
---
 net/core/dev_ioctl.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/core/dev_ioctl.c b/net/core/dev_ioctl.c
index 7e690d0..fdd8560 100644
--- a/net/core/dev_ioctl.c
+++ b/net/core/dev_ioctl.c
@@ -382,8 +382,10 @@ void dev_load(struct net *net, const char *name)
 	rcu_read_unlock();
 
 	no_module = !dev;
+	/* "netdev-%s" modules are allowed if CAP_NET_ADMIN is set */
 	if (no_module && capable(CAP_NET_ADMIN))
-		no_module = request_module("netdev-%s", name);
+		no_module = request_module_cap(CAP_NET_ADMIN, "netdev",
+					       "%s", name);
 	if (no_module && capable(CAP_SYS_MODULE))
 		request_module("%s", name);
 }
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 266+ messages in thread

* [kernel-hardening] [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 17:18   ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 17:18 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening
  Cc: Jonathan Corbet, Ingo Molnar, David S. Miller, netdev,
	Peter Zijlstra, Linus Torvalds, Djalal Harouni

This uses the new request_module_cap() facility to directly propagate
CAP_NET_ADMIN capability and the 'netdev' module prefix to the
capability subsystem as it was suggested.

We do not remove the explicit capable(CAP_NET_ADMIN) check here, but we
may remove it in future versions since it is also performed by the
capability subsystem. This allows to have a better interface where other
subsystems will just use this call and let the capability subsystem
handles the permission checks, if the modules should be loaded or not.

This is also an infrastructure fix since historically Linux always
allowed to auto-load modules without privileges, and later the net code
started to check capabilities and prefixes, adapted the CAP_NET_ADMIN
check with the 'netdev' prefix to prevent abusing the capability by
loading non-netdev modules. However from a bigger picture we want to
continue to support automatic module loading as non privileged but also
implement easy policy solutions like:

User=djalal
DenyNewFeatures=no

Which will translate to allow the interactive user djalal to load extra
Linux features. Others, volatile accounts or guests can be easily
blocked from doing so. We have introduced in previous patches the
necessary infrastructure and now with this change we start to use the
new request_module_cap() function to explicitly tell the capability
subsystem that we want to auto-load modules with CAP_NET_ADMIN if they
are prefixed.

This is also based on suggestions from Rusty Russel and Kees Cook [1]

[1] https://lkml.org/lkml/2017/4/26/735

Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Andy Lutomirski <luto@kernel.org>
Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
---
 net/core/dev_ioctl.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/core/dev_ioctl.c b/net/core/dev_ioctl.c
index 7e690d0..fdd8560 100644
--- a/net/core/dev_ioctl.c
+++ b/net/core/dev_ioctl.c
@@ -382,8 +382,10 @@ void dev_load(struct net *net, const char *name)
 	rcu_read_unlock();
 
 	no_module = !dev;
+	/* "netdev-%s" modules are allowed if CAP_NET_ADMIN is set */
 	if (no_module && capable(CAP_NET_ADMIN))
-		no_module = request_module("netdev-%s", name);
+		no_module = request_module_cap(CAP_NET_ADMIN, "netdev",
+					       "%s", name);
 	if (no_module && capable(CAP_SYS_MODULE))
 		request_module("%s", name);
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
  2017-11-27 17:18 ` Djalal Harouni
  (?)
  (?)
@ 2017-11-27 18:41   ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 18:41 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 9:18 AM, Djalal Harouni <tixxdz@gmail.com> wrote:
>
> The sysctl flag is available at "/proc/sys/kernel/modules_autoload_mode"
>
> When modules_autoload_mode is set to (0), the default, there are no
> restrictions.

So quick question: do we actually need this?

Yes, it may be the current default, but is it anything that people
actually depend on?

I'd have expected that most module loading comes from system actions
anyway, not normal users.

So I'd like to explore first whether it even makes sense to make a new option.

New options are bad because:

 - opt-in security isn't security at all

 - having to configure things is complex

so we should generally strive to _not_ need new random config options.

What are the real life use-cases for normal users having modules auto-load?

               Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 18:41   ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 18:41 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 9:18 AM, Djalal Harouni <tixxdz@gmail.com> wrote:
>
> The sysctl flag is available at "/proc/sys/kernel/modules_autoload_mode"
>
> When modules_autoload_mode is set to (0), the default, there are no
> restrictions.

So quick question: do we actually need this?

Yes, it may be the current default, but is it anything that people
actually depend on?

I'd have expected that most module loading comes from system actions
anyway, not normal users.

So I'd like to explore first whether it even makes sense to make a new option.

New options are bad because:

 - opt-in security isn't security at all

 - having to configure things is complex

so we should generally strive to _not_ need new random config options.

What are the real life use-cases for normal users having modules auto-load?

               Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 18:41   ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 18:41 UTC (permalink / raw)
  To: linux-security-module

On Mon, Nov 27, 2017 at 9:18 AM, Djalal Harouni <tixxdz@gmail.com> wrote:
>
> The sysctl flag is available at "/proc/sys/kernel/modules_autoload_mode"
>
> When modules_autoload_mode is set to (0), the default, there are no
> restrictions.

So quick question: do we actually need this?

Yes, it may be the current default, but is it anything that people
actually depend on?

I'd have expected that most module loading comes from system actions
anyway, not normal users.

So I'd like to explore first whether it even makes sense to make a new option.

New options are bad because:

 - opt-in security isn't security at all

 - having to configure things is complex

so we should generally strive to _not_ need new random config options.

What are the real life use-cases for normal users having modules auto-load?

               Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 18:41   ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 18:41 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 9:18 AM, Djalal Harouni <tixxdz@gmail.com> wrote:
>
> The sysctl flag is available at "/proc/sys/kernel/modules_autoload_mode"
>
> When modules_autoload_mode is set to (0), the default, there are no
> restrictions.

So quick question: do we actually need this?

Yes, it may be the current default, but is it anything that people
actually depend on?

I'd have expected that most module loading comes from system actions
anyway, not normal users.

So I'd like to explore first whether it even makes sense to make a new option.

New options are bad because:

 - opt-in security isn't security at all

 - having to configure things is complex

so we should generally strive to _not_ need new random config options.

What are the real life use-cases for normal users having modules auto-load?

               Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-27 17:18   ` Djalal Harouni
  (?)
  (?)
@ 2017-11-27 18:44     ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 18:44 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 9:18 AM, Djalal Harouni <tixxdz@gmail.com> wrote:
> This uses the new request_module_cap() facility to directly propagate
> CAP_NET_ADMIN capability and the 'netdev' module prefix to the
> capability subsystem as it was suggested.

This is the kind of complexity that I wonder if it's worth it at all.

Nobody sane actually uses those stupid capability bits. Have you ever
actually seen it used in real life?

They were a mistake, and we should never have done them - another case
of security people who think that complexity == security, when in
reality nobody actually wants the complexity or is willing to set it
up and manage it.

                   Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 18:44     ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 18:44 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 9:18 AM, Djalal Harouni <tixxdz@gmail.com> wrote:
> This uses the new request_module_cap() facility to directly propagate
> CAP_NET_ADMIN capability and the 'netdev' module prefix to the
> capability subsystem as it was suggested.

This is the kind of complexity that I wonder if it's worth it at all.

Nobody sane actually uses those stupid capability bits. Have you ever
actually seen it used in real life?

They were a mistake, and we should never have done them - another case
of security people who think that complexity == security, when in
reality nobody actually wants the complexity or is willing to set it
up and manage it.

                   Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 18:44     ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 18:44 UTC (permalink / raw)
  To: linux-security-module

On Mon, Nov 27, 2017 at 9:18 AM, Djalal Harouni <tixxdz@gmail.com> wrote:
> This uses the new request_module_cap() facility to directly propagate
> CAP_NET_ADMIN capability and the 'netdev' module prefix to the
> capability subsystem as it was suggested.

This is the kind of complexity that I wonder if it's worth it at all.

Nobody sane actually uses those stupid capability bits. Have you ever
actually seen it used in real life?

They were a mistake, and we should never have done them - another case
of security people who think that complexity == security, when in
reality nobody actually wants the complexity or is willing to set it
up and manage it.

                   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 18:44     ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 18:44 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 9:18 AM, Djalal Harouni <tixxdz@gmail.com> wrote:
> This uses the new request_module_cap() facility to directly propagate
> CAP_NET_ADMIN capability and the 'netdev' module prefix to the
> capability subsystem as it was suggested.

This is the kind of complexity that I wonder if it's worth it at all.

Nobody sane actually uses those stupid capability bits. Have you ever
actually seen it used in real life?

They were a mistake, and we should never have done them - another case
of security people who think that complexity == security, when in
reality nobody actually wants the complexity or is willing to set it
up and manage it.

                   Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-27 17:18   ` Djalal Harouni
  (?)
@ 2017-11-27 18:48     ` Randy Dunlap
  -1 siblings, 0 replies; 266+ messages in thread
From: Randy Dunlap @ 2017-11-27 18:48 UTC (permalink / raw)
  To: Djalal Harouni, Kees Cook, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell, linux-kernel,
	linux-security-module, kernel-hardening
  Cc: Jonathan Corbet, Ingo Molnar, David S. Miller, netdev,
	Peter Zijlstra, Linus Torvalds

Hi,

Mostly typos/spellos...


On 11/27/2017 09:18 AM, Djalal Harouni wrote:
> Cc: Serge Hallyn <serge@hallyn.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
> Suggested-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
> ---
>  include/linux/kmod.h      | 65 ++++++++++++++++++++++++++++++++++++++++++-----
>  include/linux/lsm_hooks.h |  6 ++++-
>  include/linux/security.h  |  7 +++--
>  kernel/kmod.c             | 29 ++++++++++++++++-----
>  security/security.c       |  6 +++--
>  security/selinux/hooks.c  |  3 ++-
>  6 files changed, 97 insertions(+), 19 deletions(-)
> 
> diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> index 40c89ad..ccd6a1c 100644
> --- a/include/linux/kmod.h
> +++ b/include/linux/kmod.h
> @@ -33,16 +33,67 @@

> +/**
> + * request_module  Try to load a kernel module
> + *
> + * Automatically loads the request module.
> + *
> + * @mod...: The module name
> + */

what are the "..." for?  what do they do here?

> +#define request_module(mod...) __request_module(true, -1, NULL, mod)
> +
> +#define request_module_nowait(mod...) __request_module(false, -1, NULL, mod)
> +
> +/**
> + * request_module_cap  Load kernel module only if the required capability is set
> + *
> + * Automatically load a module if the required capability is set and it
> + * corresponds to the appropriate subsystem that is indicated by prefix.
> + *
> + * This allows to load aliased modules like 'netdev-%s' with CAP_NET_ADMIN.
> + *
> + * ex:
> + *	request_module_cap(CAP_NET_ADMIN, "netdev", "%s", mod);
> + *
> + * @required_cap: Required capability to load the module
> + * @prefix: The module prefix if any, otherwise NULL
> + * @fmt: printf style format string for the name of the module with its
> + *       arguments if any
> + *
> + * If '@required_cap' is positive, the security subsystem will check if
> + * '@prefix' is set and if caller has the required capability then the
> + * operation is allowed.
> + * The security subsystem can not make assumption about the boundaries
> + * of other subsystems, it is their responsability to make a call with

                                       responsibility

> + * the right capability and module alias.
> + *
> + * If '@required_cap' is positive and '@prefix' is NULL then we assume
> + * that the '@required_cap' is CAP_SYS_MODULE.
> + *
> + * If '@required_cap' is negative then there are no permission checks, this
> + * is the equivalent to request_module() function.
> + *
> + * This function trust callers to pass the right capability with the

                    trusts

> + * appropriate prefix.
> + *
> + * Note: the permission checks may still fail, even if the required
> + * capability is negative, this is due to module loading restrictions

                    negative; this

> + * that are controlled by the enduser.
> + */
> +#define request_module_cap(required_cap, prefix, fmt...) \
> +	__request_module(true, required_cap, prefix, fmt)
> +
>  #endif /* __LINUX_KMOD_H__ */
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index 7161d8e..d898bd0 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -571,6 +571,9 @@
>   *	Ability to trigger the kernel to automatically upcall to userspace for
>   *	userspace to load a kernel module with the given name.
>   *	@kmod_name name of the module requested by the kernel
> + *	@required_cap If positive, the required capability to automatically load
> + *	the correspondig kernel module.

            corresponding

> + *	@prefix The prefix of the module if any. Can be NULL.
>   *	Return 0 if successful.
>   * @kernel_read_file:
>   *	Read a file specified by userspace.

> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index bc6addd..679d401 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -109,6 +109,8 @@ static int call_modprobe(char *module_name, int wait)
>  /**
>   * __request_module - try to load a kernel module
>   * @wait: wait (or not) for the operation to complete
> + * @required_cap: required capability to load the module
> + * @prefix: module prefix if any otherwise NULL
>   * @fmt: printf style format string for the name of the module
>   * @...: arguments as specified in the format string
>   *
> @@ -119,14 +121,20 @@ static int call_modprobe(char *module_name, int wait)
>   * must check that the service they requested is now available not blindly
>   * invoke it.
>   *
> - * If module auto-loading support is disabled then this function
> - * becomes a no-operation.
> + * If "required_cap" is positive, The security subsystem will trust the caller

                                     the

> + * that if it has "required_cap" then it may allow to load some modules that
> + * have a specific alias.
> + *
> + * If module auto-loading support is disabled by clearing the modprobe path
> + * then this function becomes a no-operation.
>   */
> -int __request_module(bool wait, const char *fmt, ...)
> +int __request_module(bool wait, int required_cap,
> +		     const char *prefix, const char *fmt, ...)
>  {
>  	va_list args;
>  	char module_name[MODULE_NAME_LEN];
>  	int ret;
> +	int len = 0;
>  
>  	/*
>  	 * We don't allow synchronous module loading from async.  Module
> @@ -139,13 +147,22 @@ int __request_module(bool wait, const char *fmt, ...)
>  	if (!modprobe_path[0])
>  		return 0;
>  
> +	/*
> +	 * Lets attach the prefix to the module name

Let's
but better:
	 * Attach the prefix to the module name

> +	 */
> +	if (prefix != NULL && *prefix != '\0') {
> +		len += snprintf(module_name, MODULE_NAME_LEN, "%s-", prefix);
> +		if (len >= MODULE_NAME_LEN)
> +			return -ENAMETOOLONG;
> +	}
> +
>  	va_start(args, fmt);
> -	ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
> +	ret = vsnprintf(module_name + len, MODULE_NAME_LEN - len, fmt, args);
>  	va_end(args);
> -	if (ret >= MODULE_NAME_LEN)
> +	if (ret >= MODULE_NAME_LEN - len)
>  		return -ENAMETOOLONG;
>  
> -	ret = security_kernel_module_request(module_name);
> +	ret = security_kernel_module_request(module_name, required_cap, prefix);
>  	if (ret)
>  		return ret;
>  


-- 
~Randy

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-27 18:48     ` Randy Dunlap
  0 siblings, 0 replies; 266+ messages in thread
From: Randy Dunlap @ 2017-11-27 18:48 UTC (permalink / raw)
  To: linux-security-module

Hi,

Mostly typos/spellos...


On 11/27/2017 09:18 AM, Djalal Harouni wrote:
> Cc: Serge Hallyn <serge@hallyn.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
> Suggested-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
> ---
>  include/linux/kmod.h      | 65 ++++++++++++++++++++++++++++++++++++++++++-----
>  include/linux/lsm_hooks.h |  6 ++++-
>  include/linux/security.h  |  7 +++--
>  kernel/kmod.c             | 29 ++++++++++++++++-----
>  security/security.c       |  6 +++--
>  security/selinux/hooks.c  |  3 ++-
>  6 files changed, 97 insertions(+), 19 deletions(-)
> 
> diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> index 40c89ad..ccd6a1c 100644
> --- a/include/linux/kmod.h
> +++ b/include/linux/kmod.h
> @@ -33,16 +33,67 @@

> +/**
> + * request_module  Try to load a kernel module
> + *
> + * Automatically loads the request module.
> + *
> + * @mod...: The module name
> + */

what are the "..." for?  what do they do here?

> +#define request_module(mod...) __request_module(true, -1, NULL, mod)
> +
> +#define request_module_nowait(mod...) __request_module(false, -1, NULL, mod)
> +
> +/**
> + * request_module_cap  Load kernel module only if the required capability is set
> + *
> + * Automatically load a module if the required capability is set and it
> + * corresponds to the appropriate subsystem that is indicated by prefix.
> + *
> + * This allows to load aliased modules like 'netdev-%s' with CAP_NET_ADMIN.
> + *
> + * ex:
> + *	request_module_cap(CAP_NET_ADMIN, "netdev", "%s", mod);
> + *
> + * @required_cap: Required capability to load the module
> + * @prefix: The module prefix if any, otherwise NULL
> + * @fmt: printf style format string for the name of the module with its
> + *       arguments if any
> + *
> + * If '@required_cap' is positive, the security subsystem will check if
> + * '@prefix' is set and if caller has the required capability then the
> + * operation is allowed.
> + * The security subsystem can not make assumption about the boundaries
> + * of other subsystems, it is their responsability to make a call with

                                       responsibility

> + * the right capability and module alias.
> + *
> + * If '@required_cap' is positive and '@prefix' is NULL then we assume
> + * that the '@required_cap' is CAP_SYS_MODULE.
> + *
> + * If '@required_cap' is negative then there are no permission checks, this
> + * is the equivalent to request_module() function.
> + *
> + * This function trust callers to pass the right capability with the

                    trusts

> + * appropriate prefix.
> + *
> + * Note: the permission checks may still fail, even if the required
> + * capability is negative, this is due to module loading restrictions

                    negative; this

> + * that are controlled by the enduser.
> + */
> +#define request_module_cap(required_cap, prefix, fmt...) \
> +	__request_module(true, required_cap, prefix, fmt)
> +
>  #endif /* __LINUX_KMOD_H__ */
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index 7161d8e..d898bd0 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -571,6 +571,9 @@
>   *	Ability to trigger the kernel to automatically upcall to userspace for
>   *	userspace to load a kernel module with the given name.
>   *	@kmod_name name of the module requested by the kernel
> + *	@required_cap If positive, the required capability to automatically load
> + *	the correspondig kernel module.

            corresponding

> + *	@prefix The prefix of the module if any. Can be NULL.
>   *	Return 0 if successful.
>   * @kernel_read_file:
>   *	Read a file specified by userspace.

> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index bc6addd..679d401 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -109,6 +109,8 @@ static int call_modprobe(char *module_name, int wait)
>  /**
>   * __request_module - try to load a kernel module
>   * @wait: wait (or not) for the operation to complete
> + * @required_cap: required capability to load the module
> + * @prefix: module prefix if any otherwise NULL
>   * @fmt: printf style format string for the name of the module
>   * @...: arguments as specified in the format string
>   *
> @@ -119,14 +121,20 @@ static int call_modprobe(char *module_name, int wait)
>   * must check that the service they requested is now available not blindly
>   * invoke it.
>   *
> - * If module auto-loading support is disabled then this function
> - * becomes a no-operation.
> + * If "required_cap" is positive, The security subsystem will trust the caller

                                     the

> + * that if it has "required_cap" then it may allow to load some modules that
> + * have a specific alias.
> + *
> + * If module auto-loading support is disabled by clearing the modprobe path
> + * then this function becomes a no-operation.
>   */
> -int __request_module(bool wait, const char *fmt, ...)
> +int __request_module(bool wait, int required_cap,
> +		     const char *prefix, const char *fmt, ...)
>  {
>  	va_list args;
>  	char module_name[MODULE_NAME_LEN];
>  	int ret;
> +	int len = 0;
>  
>  	/*
>  	 * We don't allow synchronous module loading from async.  Module
> @@ -139,13 +147,22 @@ int __request_module(bool wait, const char *fmt, ...)
>  	if (!modprobe_path[0])
>  		return 0;
>  
> +	/*
> +	 * Lets attach the prefix to the module name

Let's
but better:
	 * Attach the prefix to the module name

> +	 */
> +	if (prefix != NULL && *prefix != '\0') {
> +		len += snprintf(module_name, MODULE_NAME_LEN, "%s-", prefix);
> +		if (len >= MODULE_NAME_LEN)
> +			return -ENAMETOOLONG;
> +	}
> +
>  	va_start(args, fmt);
> -	ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
> +	ret = vsnprintf(module_name + len, MODULE_NAME_LEN - len, fmt, args);
>  	va_end(args);
> -	if (ret >= MODULE_NAME_LEN)
> +	if (ret >= MODULE_NAME_LEN - len)
>  		return -ENAMETOOLONG;
>  
> -	ret = security_kernel_module_request(module_name);
> +	ret = security_kernel_module_request(module_name, required_cap, prefix);
>  	if (ret)
>  		return ret;
>  


-- 
~Randy
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-27 18:48     ` Randy Dunlap
  0 siblings, 0 replies; 266+ messages in thread
From: Randy Dunlap @ 2017-11-27 18:48 UTC (permalink / raw)
  To: Djalal Harouni, Kees Cook, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell, linux-kernel,
	linux-security-module, kernel-hardening
  Cc: Jonathan Corbet, Ingo Molnar, David S. Miller, netdev,
	Peter Zijlstra, Linus Torvalds

Hi,

Mostly typos/spellos...


On 11/27/2017 09:18 AM, Djalal Harouni wrote:
> Cc: Serge Hallyn <serge@hallyn.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
> Suggested-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
> ---
>  include/linux/kmod.h      | 65 ++++++++++++++++++++++++++++++++++++++++++-----
>  include/linux/lsm_hooks.h |  6 ++++-
>  include/linux/security.h  |  7 +++--
>  kernel/kmod.c             | 29 ++++++++++++++++-----
>  security/security.c       |  6 +++--
>  security/selinux/hooks.c  |  3 ++-
>  6 files changed, 97 insertions(+), 19 deletions(-)
> 
> diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> index 40c89ad..ccd6a1c 100644
> --- a/include/linux/kmod.h
> +++ b/include/linux/kmod.h
> @@ -33,16 +33,67 @@

> +/**
> + * request_module  Try to load a kernel module
> + *
> + * Automatically loads the request module.
> + *
> + * @mod...: The module name
> + */

what are the "..." for?  what do they do here?

> +#define request_module(mod...) __request_module(true, -1, NULL, mod)
> +
> +#define request_module_nowait(mod...) __request_module(false, -1, NULL, mod)
> +
> +/**
> + * request_module_cap  Load kernel module only if the required capability is set
> + *
> + * Automatically load a module if the required capability is set and it
> + * corresponds to the appropriate subsystem that is indicated by prefix.
> + *
> + * This allows to load aliased modules like 'netdev-%s' with CAP_NET_ADMIN.
> + *
> + * ex:
> + *	request_module_cap(CAP_NET_ADMIN, "netdev", "%s", mod);
> + *
> + * @required_cap: Required capability to load the module
> + * @prefix: The module prefix if any, otherwise NULL
> + * @fmt: printf style format string for the name of the module with its
> + *       arguments if any
> + *
> + * If '@required_cap' is positive, the security subsystem will check if
> + * '@prefix' is set and if caller has the required capability then the
> + * operation is allowed.
> + * The security subsystem can not make assumption about the boundaries
> + * of other subsystems, it is their responsability to make a call with

                                       responsibility

> + * the right capability and module alias.
> + *
> + * If '@required_cap' is positive and '@prefix' is NULL then we assume
> + * that the '@required_cap' is CAP_SYS_MODULE.
> + *
> + * If '@required_cap' is negative then there are no permission checks, this
> + * is the equivalent to request_module() function.
> + *
> + * This function trust callers to pass the right capability with the

                    trusts

> + * appropriate prefix.
> + *
> + * Note: the permission checks may still fail, even if the required
> + * capability is negative, this is due to module loading restrictions

                    negative; this

> + * that are controlled by the enduser.
> + */
> +#define request_module_cap(required_cap, prefix, fmt...) \
> +	__request_module(true, required_cap, prefix, fmt)
> +
>  #endif /* __LINUX_KMOD_H__ */
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index 7161d8e..d898bd0 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -571,6 +571,9 @@
>   *	Ability to trigger the kernel to automatically upcall to userspace for
>   *	userspace to load a kernel module with the given name.
>   *	@kmod_name name of the module requested by the kernel
> + *	@required_cap If positive, the required capability to automatically load
> + *	the correspondig kernel module.

            corresponding

> + *	@prefix The prefix of the module if any. Can be NULL.
>   *	Return 0 if successful.
>   * @kernel_read_file:
>   *	Read a file specified by userspace.

> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index bc6addd..679d401 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -109,6 +109,8 @@ static int call_modprobe(char *module_name, int wait)
>  /**
>   * __request_module - try to load a kernel module
>   * @wait: wait (or not) for the operation to complete
> + * @required_cap: required capability to load the module
> + * @prefix: module prefix if any otherwise NULL
>   * @fmt: printf style format string for the name of the module
>   * @...: arguments as specified in the format string
>   *
> @@ -119,14 +121,20 @@ static int call_modprobe(char *module_name, int wait)
>   * must check that the service they requested is now available not blindly
>   * invoke it.
>   *
> - * If module auto-loading support is disabled then this function
> - * becomes a no-operation.
> + * If "required_cap" is positive, The security subsystem will trust the caller

                                     the

> + * that if it has "required_cap" then it may allow to load some modules that
> + * have a specific alias.
> + *
> + * If module auto-loading support is disabled by clearing the modprobe path
> + * then this function becomes a no-operation.
>   */
> -int __request_module(bool wait, const char *fmt, ...)
> +int __request_module(bool wait, int required_cap,
> +		     const char *prefix, const char *fmt, ...)
>  {
>  	va_list args;
>  	char module_name[MODULE_NAME_LEN];
>  	int ret;
> +	int len = 0;
>  
>  	/*
>  	 * We don't allow synchronous module loading from async.  Module
> @@ -139,13 +147,22 @@ int __request_module(bool wait, const char *fmt, ...)
>  	if (!modprobe_path[0])
>  		return 0;
>  
> +	/*
> +	 * Lets attach the prefix to the module name

Let's
but better:
	 * Attach the prefix to the module name

> +	 */
> +	if (prefix != NULL && *prefix != '\0') {
> +		len += snprintf(module_name, MODULE_NAME_LEN, "%s-", prefix);
> +		if (len >= MODULE_NAME_LEN)
> +			return -ENAMETOOLONG;
> +	}
> +
>  	va_start(args, fmt);
> -	ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
> +	ret = vsnprintf(module_name + len, MODULE_NAME_LEN - len, fmt, args);
>  	va_end(args);
> -	if (ret >= MODULE_NAME_LEN)
> +	if (ret >= MODULE_NAME_LEN - len)
>  		return -ENAMETOOLONG;
>  
> -	ret = security_kernel_module_request(module_name);
> +	ret = security_kernel_module_request(module_name, required_cap, prefix);
>  	if (ret)
>  		return ret;
>  


-- 
~Randy

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
  2017-11-27 18:41   ` Linus Torvalds
  (?)
  (?)
@ 2017-11-27 19:02     ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 19:02 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 10:41 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> What are the real life use-cases for normal users having modules auto-load?

Well, I could do some testing. One case is apparently both bluetoothd
and ip6tables that have been converted to indeed use CAP_NET_ADMIN.

Annoying.

Still, can't we just say "you need to have capabilities" and leave it at that?

Something (UNTESTED and whitespace damaged) like this:

  diff --git a/kernel/kmod.c b/kernel/kmod.c
  index bc6addd9152b..a3f3218f66c6 100644
  --- a/kernel/kmod.c
  +++ b/kernel/kmod.c
  @@ -139,6 +139,11 @@ int __request_module(bool wait, const char *fmt, ...)
           if (!modprobe_path[0])
                   return 0;

  +        if (WARN_ON_ONCE(!capable(CAP_SYS_MODULE) ||
  +                         !capable(CAP_SYS_ADMIN) ||
  +                         !capable(CAP_NET_ADMIN)))
  +                return -EPERM;
  +
           va_start(args, fmt);
           ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
           va_end(args);

instead of adding complex infrastructure for something people might
not want anyway?

Now, the above will not necessarily work with a legacy /dev/ directory
where al the nodes have been pre-populated, and opening the device
node is supposed to load the module. So _historically_ we did indeed
load modules as normal users. But does that really happen any more?

I'd hate to default to historical behavior if there's no actual reason to do so.

                  Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 19:02     ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 19:02 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 10:41 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> What are the real life use-cases for normal users having modules auto-load?

Well, I could do some testing. One case is apparently both bluetoothd
and ip6tables that have been converted to indeed use CAP_NET_ADMIN.

Annoying.

Still, can't we just say "you need to have capabilities" and leave it at that?

Something (UNTESTED and whitespace damaged) like this:

  diff --git a/kernel/kmod.c b/kernel/kmod.c
  index bc6addd9152b..a3f3218f66c6 100644
  --- a/kernel/kmod.c
  +++ b/kernel/kmod.c
  @@ -139,6 +139,11 @@ int __request_module(bool wait, const char *fmt, ...)
           if (!modprobe_path[0])
                   return 0;

  +        if (WARN_ON_ONCE(!capable(CAP_SYS_MODULE) ||
  +                         !capable(CAP_SYS_ADMIN) ||
  +                         !capable(CAP_NET_ADMIN)))
  +                return -EPERM;
  +
           va_start(args, fmt);
           ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
           va_end(args);

instead of adding complex infrastructure for something people might
not want anyway?

Now, the above will not necessarily work with a legacy /dev/ directory
where al the nodes have been pre-populated, and opening the device
node is supposed to load the module. So _historically_ we did indeed
load modules as normal users. But does that really happen any more?

I'd hate to default to historical behavior if there's no actual reason to do so.

                  Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 19:02     ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 19:02 UTC (permalink / raw)
  To: linux-security-module

On Mon, Nov 27, 2017 at 10:41 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> What are the real life use-cases for normal users having modules auto-load?

Well, I could do some testing. One case is apparently both bluetoothd
and ip6tables that have been converted to indeed use CAP_NET_ADMIN.

Annoying.

Still, can't we just say "you need to have capabilities" and leave it at that?

Something (UNTESTED and whitespace damaged) like this:

  diff --git a/kernel/kmod.c b/kernel/kmod.c
  index bc6addd9152b..a3f3218f66c6 100644
  --- a/kernel/kmod.c
  +++ b/kernel/kmod.c
  @@ -139,6 +139,11 @@ int __request_module(bool wait, const char *fmt, ...)
           if (!modprobe_path[0])
                   return 0;

  +        if (WARN_ON_ONCE(!capable(CAP_SYS_MODULE) ||
  +                         !capable(CAP_SYS_ADMIN) ||
  +                         !capable(CAP_NET_ADMIN)))
  +                return -EPERM;
  +
           va_start(args, fmt);
           ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
           va_end(args);

instead of adding complex infrastructure for something people might
not want anyway?

Now, the above will not necessarily work with a legacy /dev/ directory
where al the nodes have been pre-populated, and opening the device
node is supposed to load the module. So _historically_ we did indeed
load modules as normal users. But does that really happen any more?

I'd hate to default to historical behavior if there's no actual reason to do so.

                  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 19:02     ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 19:02 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 10:41 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> What are the real life use-cases for normal users having modules auto-load?

Well, I could do some testing. One case is apparently both bluetoothd
and ip6tables that have been converted to indeed use CAP_NET_ADMIN.

Annoying.

Still, can't we just say "you need to have capabilities" and leave it at that?

Something (UNTESTED and whitespace damaged) like this:

  diff --git a/kernel/kmod.c b/kernel/kmod.c
  index bc6addd9152b..a3f3218f66c6 100644
  --- a/kernel/kmod.c
  +++ b/kernel/kmod.c
  @@ -139,6 +139,11 @@ int __request_module(bool wait, const char *fmt, ...)
           if (!modprobe_path[0])
                   return 0;

  +        if (WARN_ON_ONCE(!capable(CAP_SYS_MODULE) ||
  +                         !capable(CAP_SYS_ADMIN) ||
  +                         !capable(CAP_NET_ADMIN)))
  +                return -EPERM;
  +
           va_start(args, fmt);
           ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
           va_end(args);

instead of adding complex infrastructure for something people might
not want anyway?

Now, the above will not necessarily work with a legacy /dev/ directory
where al the nodes have been pre-populated, and opening the device
node is supposed to load the module. So _historically_ we did indeed
load modules as normal users. But does that really happen any more?

I'd hate to default to historical behavior if there's no actual reason to do so.

                  Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
  2017-11-27 19:02     ` Linus Torvalds
  (?)
  (?)
@ 2017-11-27 19:12       ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 19:12 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 11:02 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Now, the above will not necessarily work with a legacy /dev/ directory
> where al the nodes have been pre-populated, and opening the device
> node is supposed to load the module. So _historically_ we did indeed
> load modules as normal users. But does that really happen any more?

Sadly, it looks like bluetoothd actually does expect to load the
bt-proto-XYZ modules with no capabilities at all.

So apparently we really do depend on not needing capabilities for
module loading.

Oh well.

                 Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 19:12       ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 19:12 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 11:02 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Now, the above will not necessarily work with a legacy /dev/ directory
> where al the nodes have been pre-populated, and opening the device
> node is supposed to load the module. So _historically_ we did indeed
> load modules as normal users. But does that really happen any more?

Sadly, it looks like bluetoothd actually does expect to load the
bt-proto-XYZ modules with no capabilities at all.

So apparently we really do depend on not needing capabilities for
module loading.

Oh well.

                 Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 19:12       ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 19:12 UTC (permalink / raw)
  To: linux-security-module

On Mon, Nov 27, 2017 at 11:02 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Now, the above will not necessarily work with a legacy /dev/ directory
> where al the nodes have been pre-populated, and opening the device
> node is supposed to load the module. So _historically_ we did indeed
> load modules as normal users. But does that really happen any more?

Sadly, it looks like bluetoothd actually does expect to load the
bt-proto-XYZ modules with no capabilities at all.

So apparently we really do depend on not needing capabilities for
module loading.

Oh well.

                 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 19:12       ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 19:12 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 11:02 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Now, the above will not necessarily work with a legacy /dev/ directory
> where al the nodes have been pre-populated, and opening the device
> node is supposed to load the module. So _historically_ we did indeed
> load modules as normal users. But does that really happen any more?

Sadly, it looks like bluetoothd actually does expect to load the
bt-proto-XYZ modules with no capabilities at all.

So apparently we really do depend on not needing capabilities for
module loading.

Oh well.

                 Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
  2017-11-27 18:41   ` Linus Torvalds
  (?)
@ 2017-11-27 19:14     ` David Miller
  -1 siblings, 0 replies; 266+ messages in thread
From: David Miller @ 2017-11-27 19:14 UTC (permalink / raw)
  To: torvalds
  Cc: tixxdz, keescook, luto, akpm, mcgrof, james.l.morris,
	ben.hutchings, solar, serge, jeyu, rusty, linux-kernel,
	linux-security-module, kernel-hardening, corbet, mingo, netdev,
	peterz

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 27 Nov 2017 10:41:30 -0800

> What are the real life use-cases for normal users having modules
> auto-load?

User opens SCTP socket, SCTP protocol module loads.

People build test cases via namespaces, and in that namespaces normal
users can setup virtual tunnel devices themselves, and those configure
operations can bring the tunnel module in.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 19:14     ` David Miller
  0 siblings, 0 replies; 266+ messages in thread
From: David Miller @ 2017-11-27 19:14 UTC (permalink / raw)
  To: linux-security-module

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 27 Nov 2017 10:41:30 -0800

> What are the real life use-cases for normal users having modules
> auto-load?

User opens SCTP socket, SCTP protocol module loads.

People build test cases via namespaces, and in that namespaces normal
users can setup virtual tunnel devices themselves, and those configure
operations can bring the tunnel module in.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 19:14     ` David Miller
  0 siblings, 0 replies; 266+ messages in thread
From: David Miller @ 2017-11-27 19:14 UTC (permalink / raw)
  To: torvalds
  Cc: tixxdz, keescook, luto, akpm, mcgrof, james.l.morris,
	ben.hutchings, solar, serge, jeyu, rusty, linux-kernel,
	linux-security-module, kernel-hardening, corbet, mingo, netdev,
	peterz

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 27 Nov 2017 10:41:30 -0800

> What are the real life use-cases for normal users having modules
> auto-load?

User opens SCTP socket, SCTP protocol module loads.

People build test cases via namespaces, and in that namespaces normal
users can setup virtual tunnel devices themselves, and those configure
operations can bring the tunnel module in.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
  2017-11-27 19:12       ` Linus Torvalds
  (?)
  (?)
@ 2017-11-27 21:31         ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 21:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

Hi Linus,

On Mon, Nov 27, 2017 at 8:12 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 11:02 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Now, the above will not necessarily work with a legacy /dev/ directory
>> where al the nodes have been pre-populated, and opening the device
>> node is supposed to load the module. So _historically_ we did indeed
>> load modules as normal users. But does that really happen any more?
>
> Sadly, it looks like bluetoothd actually does expect to load the
> bt-proto-XYZ modules with no capabilities at all.
>
> So apparently we really do depend on not needing capabilities for
> module loading.
>
> Oh well.

Yes DCCP is unprivileged, tun and all tunneling, some md drivers, some
crypto, and device drivers... fs modules can be loaded inside
usernamespaces, and maybe when some request requires external symbols
too...

However tunneling helps to solve real usecases, so that's why the
backward compatibility and opt-in.

I do perfectly understand that opt-in is not the best choice, however
these patchset includes a per process tree, and given that lot of code
is running in containers and sandboxes, it is better than nothing. I
will follow up later with patches to the major ones especially when we
force the flag by default. Ubuntu was said to be owned in a past
security contest due to this kind of things, and now since they have
ubuntu snaps or apps they can set the flag, and others will follow.

Thanks!

>                  Linus



-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 21:31         ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 21:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

Hi Linus,

On Mon, Nov 27, 2017 at 8:12 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 11:02 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Now, the above will not necessarily work with a legacy /dev/ directory
>> where al the nodes have been pre-populated, and opening the device
>> node is supposed to load the module. So _historically_ we did indeed
>> load modules as normal users. But does that really happen any more?
>
> Sadly, it looks like bluetoothd actually does expect to load the
> bt-proto-XYZ modules with no capabilities at all.
>
> So apparently we really do depend on not needing capabilities for
> module loading.
>
> Oh well.

Yes DCCP is unprivileged, tun and all tunneling, some md drivers, some
crypto, and device drivers... fs modules can be loaded inside
usernamespaces, and maybe when some request requires external symbols
too...

However tunneling helps to solve real usecases, so that's why the
backward compatibility and opt-in.

I do perfectly understand that opt-in is not the best choice, however
these patchset includes a per process tree, and given that lot of code
is running in containers and sandboxes, it is better than nothing. I
will follow up later with patches to the major ones especially when we
force the flag by default. Ubuntu was said to be owned in a past
security contest due to this kind of things, and now since they have
ubuntu snaps or apps they can set the flag, and others will follow.

Thanks!

>                  Linus



-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 21:31         ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 21:31 UTC (permalink / raw)
  To: linux-security-module

Hi Linus,

On Mon, Nov 27, 2017 at 8:12 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 11:02 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Now, the above will not necessarily work with a legacy /dev/ directory
>> where al the nodes have been pre-populated, and opening the device
>> node is supposed to load the module. So _historically_ we did indeed
>> load modules as normal users. But does that really happen any more?
>
> Sadly, it looks like bluetoothd actually does expect to load the
> bt-proto-XYZ modules with no capabilities at all.
>
> So apparently we really do depend on not needing capabilities for
> module loading.
>
> Oh well.

Yes DCCP is unprivileged, tun and all tunneling, some md drivers, some
crypto, and device drivers... fs modules can be loaded inside
usernamespaces, and maybe when some request requires external symbols
too...

However tunneling helps to solve real usecases, so that's why the
backward compatibility and opt-in.

I do perfectly understand that opt-in is not the best choice, however
these patchset includes a per process tree, and given that lot of code
is running in containers and sandboxes, it is better than nothing. I
will follow up later with patches to the major ones especially when we
force the flag by default. Ubuntu was said to be owned in a past
security contest due to this kind of things, and now since they have
ubuntu snaps or apps they can set the flag, and others will follow.

Thanks!

>                  Linus



-- 
tixxdz
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 21:31         ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 21:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

Hi Linus,

On Mon, Nov 27, 2017 at 8:12 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 11:02 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Now, the above will not necessarily work with a legacy /dev/ directory
>> where al the nodes have been pre-populated, and opening the device
>> node is supposed to load the module. So _historically_ we did indeed
>> load modules as normal users. But does that really happen any more?
>
> Sadly, it looks like bluetoothd actually does expect to load the
> bt-proto-XYZ modules with no capabilities at all.
>
> So apparently we really do depend on not needing capabilities for
> module loading.
>
> Oh well.

Yes DCCP is unprivileged, tun and all tunneling, some md drivers, some
crypto, and device drivers... fs modules can be loaded inside
usernamespaces, and maybe when some request requires external symbols
too...

However tunneling helps to solve real usecases, so that's why the
backward compatibility and opt-in.

I do perfectly understand that opt-in is not the best choice, however
these patchset includes a per process tree, and given that lot of code
is running in containers and sandboxes, it is better than nothing. I
will follow up later with patches to the major ones especially when we
force the flag by default. Ubuntu was said to be owned in a past
security contest due to this kind of things, and now since they have
ubuntu snaps or apps they can set the flag, and others will follow.

Thanks!

>                  Linus



-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-27 18:48     ` Randy Dunlap
  (?)
@ 2017-11-27 21:35       ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 21:35 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

Hi Randy,

On Mon, Nov 27, 2017 at 7:48 PM, Randy Dunlap <rdunlap@infradead.org> wrote:
> Hi,
>
> Mostly typos/spellos...
>
>
> On 11/27/2017 09:18 AM, Djalal Harouni wrote:
>> Cc: Serge Hallyn <serge@hallyn.com>
>> Cc: Andy Lutomirski <luto@kernel.org>
>> Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
>> Suggested-by: Kees Cook <keescook@chromium.org>
>> Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
>> ---
>>  include/linux/kmod.h      | 65 ++++++++++++++++++++++++++++++++++++++++++-----
>>  include/linux/lsm_hooks.h |  6 ++++-
>>  include/linux/security.h  |  7 +++--
>>  kernel/kmod.c             | 29 ++++++++++++++++-----
>>  security/security.c       |  6 +++--
>>  security/selinux/hooks.c  |  3 ++-
>>  6 files changed, 97 insertions(+), 19 deletions(-)
>>
>> diff --git a/include/linux/kmod.h b/include/linux/kmod.h
>> index 40c89ad..ccd6a1c 100644
>> --- a/include/linux/kmod.h
>> +++ b/include/linux/kmod.h
>> @@ -33,16 +33,67 @@
>
>> +/**
>> + * request_module  Try to load a kernel module
>> + *
>> + * Automatically loads the request module.
>> + *
>> + * @mod...: The module name
>> + */
>
> what are the "..." for?  what do they do here?

Ok, will fix it.

>
>> +#define request_module(mod...) __request_module(true, -1, NULL, mod)
>> +
>> +#define request_module_nowait(mod...) __request_module(false, -1, NULL, mod)
>> +
>> +/**
>> + * request_module_cap  Load kernel module only if the required capability is set
>> + *
[...]
>
>
> --
> ~Randy

Thank you very much for the review, will fix all.


-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-27 21:35       ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 21:35 UTC (permalink / raw)
  To: linux-security-module

Hi Randy,

On Mon, Nov 27, 2017 at 7:48 PM, Randy Dunlap <rdunlap@infradead.org> wrote:
> Hi,
>
> Mostly typos/spellos...
>
>
> On 11/27/2017 09:18 AM, Djalal Harouni wrote:
>> Cc: Serge Hallyn <serge@hallyn.com>
>> Cc: Andy Lutomirski <luto@kernel.org>
>> Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
>> Suggested-by: Kees Cook <keescook@chromium.org>
>> Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
>> ---
>>  include/linux/kmod.h      | 65 ++++++++++++++++++++++++++++++++++++++++++-----
>>  include/linux/lsm_hooks.h |  6 ++++-
>>  include/linux/security.h  |  7 +++--
>>  kernel/kmod.c             | 29 ++++++++++++++++-----
>>  security/security.c       |  6 +++--
>>  security/selinux/hooks.c  |  3 ++-
>>  6 files changed, 97 insertions(+), 19 deletions(-)
>>
>> diff --git a/include/linux/kmod.h b/include/linux/kmod.h
>> index 40c89ad..ccd6a1c 100644
>> --- a/include/linux/kmod.h
>> +++ b/include/linux/kmod.h
>> @@ -33,16 +33,67 @@
>
>> +/**
>> + * request_module  Try to load a kernel module
>> + *
>> + * Automatically loads the request module.
>> + *
>> + * @mod...: The module name
>> + */
>
> what are the "..." for?  what do they do here?

Ok, will fix it.

>
>> +#define request_module(mod...) __request_module(true, -1, NULL, mod)
>> +
>> +#define request_module_nowait(mod...) __request_module(false, -1, NULL, mod)
>> +
>> +/**
>> + * request_module_cap  Load kernel module only if the required capability is set
>> + *
[...]
>
>
> --
> ~Randy

Thank you very much for the review, will fix all.


-- 
tixxdz
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-27 21:35       ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 21:35 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

Hi Randy,

On Mon, Nov 27, 2017 at 7:48 PM, Randy Dunlap <rdunlap@infradead.org> wrote:
> Hi,
>
> Mostly typos/spellos...
>
>
> On 11/27/2017 09:18 AM, Djalal Harouni wrote:
>> Cc: Serge Hallyn <serge@hallyn.com>
>> Cc: Andy Lutomirski <luto@kernel.org>
>> Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
>> Suggested-by: Kees Cook <keescook@chromium.org>
>> Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
>> ---
>>  include/linux/kmod.h      | 65 ++++++++++++++++++++++++++++++++++++++++++-----
>>  include/linux/lsm_hooks.h |  6 ++++-
>>  include/linux/security.h  |  7 +++--
>>  kernel/kmod.c             | 29 ++++++++++++++++-----
>>  security/security.c       |  6 +++--
>>  security/selinux/hooks.c  |  3 ++-
>>  6 files changed, 97 insertions(+), 19 deletions(-)
>>
>> diff --git a/include/linux/kmod.h b/include/linux/kmod.h
>> index 40c89ad..ccd6a1c 100644
>> --- a/include/linux/kmod.h
>> +++ b/include/linux/kmod.h
>> @@ -33,16 +33,67 @@
>
>> +/**
>> + * request_module  Try to load a kernel module
>> + *
>> + * Automatically loads the request module.
>> + *
>> + * @mod...: The module name
>> + */
>
> what are the "..." for?  what do they do here?

Ok, will fix it.

>
>> +#define request_module(mod...) __request_module(true, -1, NULL, mod)
>> +
>> +#define request_module_nowait(mod...) __request_module(false, -1, NULL, mod)
>> +
>> +/**
>> + * request_module_cap  Load kernel module only if the required capability is set
>> + *
[...]
>
>
> --
> ~Randy

Thank you very much for the review, will fix all.


-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-27 18:44     ` Linus Torvalds
  (?)
  (?)
@ 2017-11-27 21:41       ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 21:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

Hi Linus,

On Mon, Nov 27, 2017 at 7:44 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 9:18 AM, Djalal Harouni <tixxdz@gmail.com> wrote:
>> This uses the new request_module_cap() facility to directly propagate
>> CAP_NET_ADMIN capability and the 'netdev' module prefix to the
>> capability subsystem as it was suggested.
>
> This is the kind of complexity that I wonder if it's worth it at all.
>
> Nobody sane actually uses those stupid capability bits. Have you ever
> actually seen it used in real life?

Yes they are complicated even for developers, and normal users do not
understand them, however yes every sandbox and container is exposing
them to endusers directly, they are documented!  so yes CAP_SYS_MODULE
is exposed but it does not cover autoloading.

However, we are trying hard to abstract some semantics that are easy
to grasp, we are mutating capabilities and seccomp to have an
abstracted "yes/no" options for our endusers.

Now, if you are referring to kernel code, the networking subsystem is
using them and I don't want to break any assumption here. There is
still the request_module(), the request_module_cap() was suggested so
networking code later won't have to do the checks on its own, and
maybe it can be consistent in the long term. The phonet sockets even
needs CAP_SYS_ADMIN...


>
> They were a mistake, and we should never have done them - another case
> of security people who think that complexity == security, when in
> reality nobody actually wants the complexity or is willing to set it
> up and manage it.

Alright, but I guess we are stuck, is there something better on how we
can do this or describe this ?


Please note in these patches, the mode is specifically described as:

* allowed: for backward compatibility  (I would have done without it)
* privileged: which includes capabilities (backward compatibility too)
or we can add what ever in the future
* disabled: even for privileged.

So I would have preferred if it is something like "yes/no" but...
However in userspace we will try hard to hide this complexity and the
capability bits.

Now I can see that the code comments and doc refer to privileged with
capabilities a lot, where we can maybe update that doc and code to
less state that privileged means capabilities ? Suggestions ?

Thanks!

>                    Linus


-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 21:41       ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 21:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

Hi Linus,

On Mon, Nov 27, 2017 at 7:44 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 9:18 AM, Djalal Harouni <tixxdz@gmail.com> wrote:
>> This uses the new request_module_cap() facility to directly propagate
>> CAP_NET_ADMIN capability and the 'netdev' module prefix to the
>> capability subsystem as it was suggested.
>
> This is the kind of complexity that I wonder if it's worth it at all.
>
> Nobody sane actually uses those stupid capability bits. Have you ever
> actually seen it used in real life?

Yes they are complicated even for developers, and normal users do not
understand them, however yes every sandbox and container is exposing
them to endusers directly, they are documented!  so yes CAP_SYS_MODULE
is exposed but it does not cover autoloading.

However, we are trying hard to abstract some semantics that are easy
to grasp, we are mutating capabilities and seccomp to have an
abstracted "yes/no" options for our endusers.

Now, if you are referring to kernel code, the networking subsystem is
using them and I don't want to break any assumption here. There is
still the request_module(), the request_module_cap() was suggested so
networking code later won't have to do the checks on its own, and
maybe it can be consistent in the long term. The phonet sockets even
needs CAP_SYS_ADMIN...


>
> They were a mistake, and we should never have done them - another case
> of security people who think that complexity == security, when in
> reality nobody actually wants the complexity or is willing to set it
> up and manage it.

Alright, but I guess we are stuck, is there something better on how we
can do this or describe this ?


Please note in these patches, the mode is specifically described as:

* allowed: for backward compatibility  (I would have done without it)
* privileged: which includes capabilities (backward compatibility too)
or we can add what ever in the future
* disabled: even for privileged.

So I would have preferred if it is something like "yes/no" but...
However in userspace we will try hard to hide this complexity and the
capability bits.

Now I can see that the code comments and doc refer to privileged with
capabilities a lot, where we can maybe update that doc and code to
less state that privileged means capabilities ? Suggestions ?

Thanks!

>                    Linus


-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 21:41       ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 21:41 UTC (permalink / raw)
  To: linux-security-module

Hi Linus,

On Mon, Nov 27, 2017 at 7:44 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 9:18 AM, Djalal Harouni <tixxdz@gmail.com> wrote:
>> This uses the new request_module_cap() facility to directly propagate
>> CAP_NET_ADMIN capability and the 'netdev' module prefix to the
>> capability subsystem as it was suggested.
>
> This is the kind of complexity that I wonder if it's worth it at all.
>
> Nobody sane actually uses those stupid capability bits. Have you ever
> actually seen it used in real life?

Yes they are complicated even for developers, and normal users do not
understand them, however yes every sandbox and container is exposing
them to endusers directly, they are documented!  so yes CAP_SYS_MODULE
is exposed but it does not cover autoloading.

However, we are trying hard to abstract some semantics that are easy
to grasp, we are mutating capabilities and seccomp to have an
abstracted "yes/no" options for our endusers.

Now, if you are referring to kernel code, the networking subsystem is
using them and I don't want to break any assumption here. There is
still the request_module(), the request_module_cap() was suggested so
networking code later won't have to do the checks on its own, and
maybe it can be consistent in the long term. The phonet sockets even
needs CAP_SYS_ADMIN...


>
> They were a mistake, and we should never have done them - another case
> of security people who think that complexity == security, when in
> reality nobody actually wants the complexity or is willing to set it
> up and manage it.

Alright, but I guess we are stuck, is there something better on how we
can do this or describe this ?


Please note in these patches, the mode is specifically described as:

* allowed: for backward compatibility  (I would have done without it)
* privileged: which includes capabilities (backward compatibility too)
or we can add what ever in the future
* disabled: even for privileged.

So I would have preferred if it is something like "yes/no" but...
However in userspace we will try hard to hide this complexity and the
capability bits.

Now I can see that the code comments and doc refer to privileged with
capabilities a lot, where we can maybe update that doc and code to
less state that privileged means capabilities ? Suggestions ?

Thanks!

>                    Linus


-- 
tixxdz
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 21:41       ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-27 21:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

Hi Linus,

On Mon, Nov 27, 2017 at 7:44 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 9:18 AM, Djalal Harouni <tixxdz@gmail.com> wrote:
>> This uses the new request_module_cap() facility to directly propagate
>> CAP_NET_ADMIN capability and the 'netdev' module prefix to the
>> capability subsystem as it was suggested.
>
> This is the kind of complexity that I wonder if it's worth it at all.
>
> Nobody sane actually uses those stupid capability bits. Have you ever
> actually seen it used in real life?

Yes they are complicated even for developers, and normal users do not
understand them, however yes every sandbox and container is exposing
them to endusers directly, they are documented!  so yes CAP_SYS_MODULE
is exposed but it does not cover autoloading.

However, we are trying hard to abstract some semantics that are easy
to grasp, we are mutating capabilities and seccomp to have an
abstracted "yes/no" options for our endusers.

Now, if you are referring to kernel code, the networking subsystem is
using them and I don't want to break any assumption here. There is
still the request_module(), the request_module_cap() was suggested so
networking code later won't have to do the checks on its own, and
maybe it can be consistent in the long term. The phonet sockets even
needs CAP_SYS_ADMIN...


>
> They were a mistake, and we should never have done them - another case
> of security people who think that complexity == security, when in
> reality nobody actually wants the complexity or is willing to set it
> up and manage it.

Alright, but I guess we are stuck, is there something better on how we
can do this or describe this ?


Please note in these patches, the mode is specifically described as:

* allowed: for backward compatibility  (I would have done without it)
* privileged: which includes capabilities (backward compatibility too)
or we can add what ever in the future
* disabled: even for privileged.

So I would have preferred if it is something like "yes/no" but...
However in userspace we will try hard to hide this complexity and the
capability bits.

Now I can see that the code comments and doc refer to privileged with
capabilities a lot, where we can maybe update that doc and code to
less state that privileged means capabilities ? Suggestions ?

Thanks!

>                    Linus


-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-27 21:41       ` Djalal Harouni
  (?)
  (?)
@ 2017-11-27 22:04         ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 22:04 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 1:41 PM, Djalal Harouni <tixxdz@gmail.com> wrote:
>
> However, we are trying hard to abstract some semantics that are easy
> to grasp, we are mutating capabilities and seccomp to have an
> abstracted "yes/no" options for our endusers.

Yes.

Sadly, it looks like we actually do have users that just expect to
load modules dynamically without any capabilities at all.

So we can't actually disallow it by default at all, which imho makes
this security option essentially useless.

A security option that people can't use without breaking their system
is pointless.

We saw that with SELinux - people ended up just disabling it for
_years_, simply because it ended up breaking so much in practice. And
yes, it got fixed eventually, but at an incredibly high maintenance
cost of all the crazy rules databases.

> Alright, but I guess we are stuck, is there something better on how we
> can do this or describe this ?

So I wonder if we can perhaps look at the places that actually do
"requerst_module()", and start filtering them on that basis.

Some of them will already have checked for capabilities.

Others clearly expect to juist work even _without_ capabilities (ie
the bluetoothd case).

So the whole "let's add a global config option" model is broken. There
is no possible global rule. It will break things, which in turn mean
that people won't turn it on (and we can't turn it on by default),
which in turn makes this pointless.

In other words, I really think that anything that just adds a mode
flag cannot work.

So instead of having one "modules_autoload_mode" thing, maybe the
individual requerst_module() cases need to simply be audited.

Put another way: I think the part of your patch series that does that
"request_module_cap()" and makes the netdev modules use it is a good
addition.

It's the "mode" part I really don't agree with, because apparently we
really need to default it to permissive.

So how about instead:

 - add that "request_module_cap()" and make the networking code that
already uses CAP_ADMIN_NET use it.

 - make "request_module()" itself default to being
"request_module_cap(CAP_SYS_MODULE,..)"

 - make sure that when the capability check fails, we print an error
message, and then for the ones that trigger, we will audit them and
see if it's ok.

Because that "mode" flag defaulting to off will just mean that the
default case will remain the existing unsafe one, and that's bad.

Opt-in really doesn't work. We've done it.

Global flags for varied behavior really doesn't work. We've done that
too. Different cases want different behavior, the global flag is just
fundamentally broken.

              Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 22:04         ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 22:04 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 1:41 PM, Djalal Harouni <tixxdz@gmail.com> wrote:
>
> However, we are trying hard to abstract some semantics that are easy
> to grasp, we are mutating capabilities and seccomp to have an
> abstracted "yes/no" options for our endusers.

Yes.

Sadly, it looks like we actually do have users that just expect to
load modules dynamically without any capabilities at all.

So we can't actually disallow it by default at all, which imho makes
this security option essentially useless.

A security option that people can't use without breaking their system
is pointless.

We saw that with SELinux - people ended up just disabling it for
_years_, simply because it ended up breaking so much in practice. And
yes, it got fixed eventually, but at an incredibly high maintenance
cost of all the crazy rules databases.

> Alright, but I guess we are stuck, is there something better on how we
> can do this or describe this ?

So I wonder if we can perhaps look at the places that actually do
"requerst_module()", and start filtering them on that basis.

Some of them will already have checked for capabilities.

Others clearly expect to juist work even _without_ capabilities (ie
the bluetoothd case).

So the whole "let's add a global config option" model is broken. There
is no possible global rule. It will break things, which in turn mean
that people won't turn it on (and we can't turn it on by default),
which in turn makes this pointless.

In other words, I really think that anything that just adds a mode
flag cannot work.

So instead of having one "modules_autoload_mode" thing, maybe the
individual requerst_module() cases need to simply be audited.

Put another way: I think the part of your patch series that does that
"request_module_cap()" and makes the netdev modules use it is a good
addition.

It's the "mode" part I really don't agree with, because apparently we
really need to default it to permissive.

So how about instead:

 - add that "request_module_cap()" and make the networking code that
already uses CAP_ADMIN_NET use it.

 - make "request_module()" itself default to being
"request_module_cap(CAP_SYS_MODULE,..)"

 - make sure that when the capability check fails, we print an error
message, and then for the ones that trigger, we will audit them and
see if it's ok.

Because that "mode" flag defaulting to off will just mean that the
default case will remain the existing unsafe one, and that's bad.

Opt-in really doesn't work. We've done it.

Global flags for varied behavior really doesn't work. We've done that
too. Different cases want different behavior, the global flag is just
fundamentally broken.

              Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 22:04         ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 22:04 UTC (permalink / raw)
  To: linux-security-module

On Mon, Nov 27, 2017 at 1:41 PM, Djalal Harouni <tixxdz@gmail.com> wrote:
>
> However, we are trying hard to abstract some semantics that are easy
> to grasp, we are mutating capabilities and seccomp to have an
> abstracted "yes/no" options for our endusers.

Yes.

Sadly, it looks like we actually do have users that just expect to
load modules dynamically without any capabilities at all.

So we can't actually disallow it by default at all, which imho makes
this security option essentially useless.

A security option that people can't use without breaking their system
is pointless.

We saw that with SELinux - people ended up just disabling it for
_years_, simply because it ended up breaking so much in practice. And
yes, it got fixed eventually, but at an incredibly high maintenance
cost of all the crazy rules databases.

> Alright, but I guess we are stuck, is there something better on how we
> can do this or describe this ?

So I wonder if we can perhaps look at the places that actually do
"requerst_module()", and start filtering them on that basis.

Some of them will already have checked for capabilities.

Others clearly expect to juist work even _without_ capabilities (ie
the bluetoothd case).

So the whole "let's add a global config option" model is broken. There
is no possible global rule. It will break things, which in turn mean
that people won't turn it on (and we can't turn it on by default),
which in turn makes this pointless.

In other words, I really think that anything that just adds a mode
flag cannot work.

So instead of having one "modules_autoload_mode" thing, maybe the
individual requerst_module() cases need to simply be audited.

Put another way: I think the part of your patch series that does that
"request_module_cap()" and makes the netdev modules use it is a good
addition.

It's the "mode" part I really don't agree with, because apparently we
really need to default it to permissive.

So how about instead:

 - add that "request_module_cap()" and make the networking code that
already uses CAP_ADMIN_NET use it.

 - make "request_module()" itself default to being
"request_module_cap(CAP_SYS_MODULE,..)"

 - make sure that when the capability check fails, we print an error
message, and then for the ones that trigger, we will audit them and
see if it's ok.

Because that "mode" flag defaulting to off will just mean that the
default case will remain the existing unsafe one, and that's bad.

Opt-in really doesn't work. We've done it.

Global flags for varied behavior really doesn't work. We've done that
too. Different cases want different behavior, the global flag is just
fundamentally broken.

              Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 22:04         ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 22:04 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 1:41 PM, Djalal Harouni <tixxdz@gmail.com> wrote:
>
> However, we are trying hard to abstract some semantics that are easy
> to grasp, we are mutating capabilities and seccomp to have an
> abstracted "yes/no" options for our endusers.

Yes.

Sadly, it looks like we actually do have users that just expect to
load modules dynamically without any capabilities at all.

So we can't actually disallow it by default at all, which imho makes
this security option essentially useless.

A security option that people can't use without breaking their system
is pointless.

We saw that with SELinux - people ended up just disabling it for
_years_, simply because it ended up breaking so much in practice. And
yes, it got fixed eventually, but at an incredibly high maintenance
cost of all the crazy rules databases.

> Alright, but I guess we are stuck, is there something better on how we
> can do this or describe this ?

So I wonder if we can perhaps look at the places that actually do
"requerst_module()", and start filtering them on that basis.

Some of them will already have checked for capabilities.

Others clearly expect to juist work even _without_ capabilities (ie
the bluetoothd case).

So the whole "let's add a global config option" model is broken. There
is no possible global rule. It will break things, which in turn mean
that people won't turn it on (and we can't turn it on by default),
which in turn makes this pointless.

In other words, I really think that anything that just adds a mode
flag cannot work.

So instead of having one "modules_autoload_mode" thing, maybe the
individual requerst_module() cases need to simply be audited.

Put another way: I think the part of your patch series that does that
"request_module_cap()" and makes the netdev modules use it is a good
addition.

It's the "mode" part I really don't agree with, because apparently we
really need to default it to permissive.

So how about instead:

 - add that "request_module_cap()" and make the networking code that
already uses CAP_ADMIN_NET use it.

 - make "request_module()" itself default to being
"request_module_cap(CAP_SYS_MODULE,..)"

 - make sure that when the capability check fails, we print an error
message, and then for the ones that trigger, we will audit them and
see if it's ok.

Because that "mode" flag defaulting to off will just mean that the
default case will remain the existing unsafe one, and that's bad.

Opt-in really doesn't work. We've done it.

Global flags for varied behavior really doesn't work. We've done that
too. Different cases want different behavior, the global flag is just
fundamentally broken.

              Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
  2017-11-27 19:14     ` David Miller
  (?)
@ 2017-11-27 22:31       ` James Morris
  -1 siblings, 0 replies; 266+ messages in thread
From: James Morris @ 2017-11-27 22:31 UTC (permalink / raw)
  To: David Miller
  Cc: torvalds, tixxdz, keescook, luto, akpm, mcgrof, ben.hutchings,
	solar, serge, jeyu, rusty, linux-kernel, linux-security-module,
	kernel-hardening, corbet, mingo, netdev, peterz

On Tue, 28 Nov 2017, David Miller wrote:

> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Mon, 27 Nov 2017 10:41:30 -0800
> 
> > What are the real life use-cases for normal users having modules
> > auto-load?
> 
> User opens SCTP socket, SCTP protocol module loads.
> 
> People build test cases via namespaces, and in that namespaces normal
> users can setup virtual tunnel devices themselves, and those configure
> operations can bring the tunnel module in.

What about implementing a white list of modules which are able to be 
loaded by unprivileged users?

Then, Linus' solution would look something like:

	va_start(args, fmt);
	ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
	va_end(args);

	if (WARN_ON_ONCE(!capable(CAP_SYS_MODULE) ||
                         !capable(CAP_SYS_ADMIN) ||
                         !capable(CAP_NET_ADMIN) ||
                         !unprivileged_autoload(module_name)))
		return -EPERM;



-- 
James Morris
<james.l.morris@oracle.com>

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 22:31       ` James Morris
  0 siblings, 0 replies; 266+ messages in thread
From: James Morris @ 2017-11-27 22:31 UTC (permalink / raw)
  To: linux-security-module

On Tue, 28 Nov 2017, David Miller wrote:

> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Mon, 27 Nov 2017 10:41:30 -0800
> 
> > What are the real life use-cases for normal users having modules
> > auto-load?
> 
> User opens SCTP socket, SCTP protocol module loads.
> 
> People build test cases via namespaces, and in that namespaces normal
> users can setup virtual tunnel devices themselves, and those configure
> operations can bring the tunnel module in.

What about implementing a white list of modules which are able to be 
loaded by unprivileged users?

Then, Linus' solution would look something like:

	va_start(args, fmt);
	ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
	va_end(args);

	if (WARN_ON_ONCE(!capable(CAP_SYS_MODULE) ||
                         !capable(CAP_SYS_ADMIN) ||
                         !capable(CAP_NET_ADMIN) ||
                         !unprivileged_autoload(module_name)))
		return -EPERM;



-- 
James Morris
<james.l.morris@oracle.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 22:31       ` James Morris
  0 siblings, 0 replies; 266+ messages in thread
From: James Morris @ 2017-11-27 22:31 UTC (permalink / raw)
  To: David Miller
  Cc: torvalds, tixxdz, keescook, luto, akpm, mcgrof, ben.hutchings,
	solar, serge, jeyu, rusty, linux-kernel, linux-security-module,
	kernel-hardening, corbet, mingo, netdev, peterz

On Tue, 28 Nov 2017, David Miller wrote:

> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Mon, 27 Nov 2017 10:41:30 -0800
> 
> > What are the real life use-cases for normal users having modules
> > auto-load?
> 
> User opens SCTP socket, SCTP protocol module loads.
> 
> People build test cases via namespaces, and in that namespaces normal
> users can setup virtual tunnel devices themselves, and those configure
> operations can bring the tunnel module in.

What about implementing a white list of modules which are able to be 
loaded by unprivileged users?

Then, Linus' solution would look something like:

	va_start(args, fmt);
	ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
	va_end(args);

	if (WARN_ON_ONCE(!capable(CAP_SYS_MODULE) ||
                         !capable(CAP_SYS_ADMIN) ||
                         !capable(CAP_NET_ADMIN) ||
                         !unprivileged_autoload(module_name)))
		return -EPERM;



-- 
James Morris
<james.l.morris@oracle.com>

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-27 22:04         ` Linus Torvalds
  (?)
  (?)
@ 2017-11-27 22:59           ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-27 22:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 2:04 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 1:41 PM, Djalal Harouni <tixxdz@gmail.com> wrote:
>>
>> However, we are trying hard to abstract some semantics that are easy
>> to grasp, we are mutating capabilities and seccomp to have an
>> abstracted "yes/no" options for our endusers.
>
> Yes.
>
> Sadly, it looks like we actually do have users that just expect to
> load modules dynamically without any capabilities at all.
>
> So we can't actually disallow it by default at all, which imho makes
> this security option essentially useless.

The good news here is that this series comes with an expected user:
systemd, for which I suspect this will option will get wider and wider
use.

> A security option that people can't use without breaking their system
> is pointless.

Another bit of good news is that systems _depending_ cap-less module
loading are uncommon enough that many system builders can enable this
protection and nothing will break. (Even you thought it was rare
enough that we could just make this default-enabled.)

Besides, distros understand that they have to keep an eye out for
these things since the kernel has a long history of not allowing
default-enabled protections that change userspace behavior, etc.

> So I wonder if we can perhaps look at the places that actually do
> "requerst_module()", and start filtering them on that basis.
>
> Some of them will already have checked for capabilities.
>
> Others clearly expect to just work even _without_ capabilities (ie
> the bluetoothd case).
>
> So the whole "let's add a global config option" model is broken. There
> is no possible global rule. It will break things, which in turn mean
> that people won't turn it on (and we can't turn it on by default),
> which in turn makes this pointless.
>
> In other words, I really think that anything that just adds a mode
> flag cannot work.
>
> So instead of having one "modules_autoload_mode" thing, maybe the
> individual requerst_module() cases need to simply be audited.
>
> Put another way: I think the part of your patch series that does that
> "request_module_cap()" and makes the netdev modules use it is a good
> addition.
>
> It's the "mode" part I really don't agree with, because apparently we
> really need to default it to permissive.
>
> So how about instead:
>
>  - add that "request_module_cap()" and make the networking code that
> already uses CAP_ADMIN_NET use it.
>
>  - make "request_module()" itself default to being
> "request_module_cap(CAP_SYS_MODULE,..)"
>
>  - make sure that when the capability check fails, we print an error
> message, and then for the ones that trigger, we will audit them and
> see if it's ok.

The issue isn't "is it always okay to cap-less-load this module?" it's
"should the kernel's attack surface be allowed to grown by an
unprivileged user?" Nearly all the request_module() sites have already
been audited and adjusted to avoid _arbitrary_ module loading (usually
by adding module prefixes: netdev-, crypto-, etc), and that greatly
helped reduce the ability for a unprivileged user to load a
known-buggy module, but there will remain many subsystems that
continue to expect to grow the kernel's features on demand within
their subsystem, and sometimes those modules will have bugs that an
unprivileged user can exploit (this history of this happening is quite
real; it's not a theoretical concern). This is especially problematic
for distro kernels where they build tons of modules.

There isn't a way for an admin to say "I only want root to be able to
load modules". They can't delete all the "unused" kernel modules,
because maybe root will want the module, they can't change module file
permissions because the modules are read with init's permissions (and
it would be terrible to repeatedly change all the file perms each time
a new kernel got installed), modules can only be blacklisted (not
whitelisted) in /etc/modprobe.d/, etc.

> Because that "mode" flag defaulting to off will just mean that the
> default case will remain the existing unsafe one, and that's bad.
>
> Opt-in really doesn't work. We've done it.
>
> Global flags for varied behavior really doesn't work. We've done that
> too. Different cases want different behavior, the global flag is just
> fundamentally broken.

I don't disagree that a global should be avoided, but I'm struggling
to see another option here. We can't break userspace by default so we
can't restrict cap-less loading by default. But we can allow userspace
to _choose_ to break itself, especially within a container. This isn't
uncommon, especially for modules, where we even have the global
"modules_disabled" sysctl already. The level of granularity of control
here is the issue, and it's what this series solves.

The options I see for module loading control are:
1) monolithic kernel (no modules)
2) modular kernel that flips on modules_disabled after boot (no
modules after boot)
3) modular kernel that allows per-subsystem unpriv module loading (all
modules loadable)

There is a demand for something between 2 and 3 where only root can
load modules. (And as pointed out in the series, this is _especially_
true for containers where the admin may want to leave module loading
alone in the init namespace, but stop any module loading in the
container.)

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 22:59           ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-27 22:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 2:04 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 1:41 PM, Djalal Harouni <tixxdz@gmail.com> wrote:
>>
>> However, we are trying hard to abstract some semantics that are easy
>> to grasp, we are mutating capabilities and seccomp to have an
>> abstracted "yes/no" options for our endusers.
>
> Yes.
>
> Sadly, it looks like we actually do have users that just expect to
> load modules dynamically without any capabilities at all.
>
> So we can't actually disallow it by default at all, which imho makes
> this security option essentially useless.

The good news here is that this series comes with an expected user:
systemd, for which I suspect this will option will get wider and wider
use.

> A security option that people can't use without breaking their system
> is pointless.

Another bit of good news is that systems _depending_ cap-less module
loading are uncommon enough that many system builders can enable this
protection and nothing will break. (Even you thought it was rare
enough that we could just make this default-enabled.)

Besides, distros understand that they have to keep an eye out for
these things since the kernel has a long history of not allowing
default-enabled protections that change userspace behavior, etc.

> So I wonder if we can perhaps look at the places that actually do
> "requerst_module()", and start filtering them on that basis.
>
> Some of them will already have checked for capabilities.
>
> Others clearly expect to just work even _without_ capabilities (ie
> the bluetoothd case).
>
> So the whole "let's add a global config option" model is broken. There
> is no possible global rule. It will break things, which in turn mean
> that people won't turn it on (and we can't turn it on by default),
> which in turn makes this pointless.
>
> In other words, I really think that anything that just adds a mode
> flag cannot work.
>
> So instead of having one "modules_autoload_mode" thing, maybe the
> individual requerst_module() cases need to simply be audited.
>
> Put another way: I think the part of your patch series that does that
> "request_module_cap()" and makes the netdev modules use it is a good
> addition.
>
> It's the "mode" part I really don't agree with, because apparently we
> really need to default it to permissive.
>
> So how about instead:
>
>  - add that "request_module_cap()" and make the networking code that
> already uses CAP_ADMIN_NET use it.
>
>  - make "request_module()" itself default to being
> "request_module_cap(CAP_SYS_MODULE,..)"
>
>  - make sure that when the capability check fails, we print an error
> message, and then for the ones that trigger, we will audit them and
> see if it's ok.

The issue isn't "is it always okay to cap-less-load this module?" it's
"should the kernel's attack surface be allowed to grown by an
unprivileged user?" Nearly all the request_module() sites have already
been audited and adjusted to avoid _arbitrary_ module loading (usually
by adding module prefixes: netdev-, crypto-, etc), and that greatly
helped reduce the ability for a unprivileged user to load a
known-buggy module, but there will remain many subsystems that
continue to expect to grow the kernel's features on demand within
their subsystem, and sometimes those modules will have bugs that an
unprivileged user can exploit (this history of this happening is quite
real; it's not a theoretical concern). This is especially problematic
for distro kernels where they build tons of modules.

There isn't a way for an admin to say "I only want root to be able to
load modules". They can't delete all the "unused" kernel modules,
because maybe root will want the module, they can't change module file
permissions because the modules are read with init's permissions (and
it would be terrible to repeatedly change all the file perms each time
a new kernel got installed), modules can only be blacklisted (not
whitelisted) in /etc/modprobe.d/, etc.

> Because that "mode" flag defaulting to off will just mean that the
> default case will remain the existing unsafe one, and that's bad.
>
> Opt-in really doesn't work. We've done it.
>
> Global flags for varied behavior really doesn't work. We've done that
> too. Different cases want different behavior, the global flag is just
> fundamentally broken.

I don't disagree that a global should be avoided, but I'm struggling
to see another option here. We can't break userspace by default so we
can't restrict cap-less loading by default. But we can allow userspace
to _choose_ to break itself, especially within a container. This isn't
uncommon, especially for modules, where we even have the global
"modules_disabled" sysctl already. The level of granularity of control
here is the issue, and it's what this series solves.

The options I see for module loading control are:
1) monolithic kernel (no modules)
2) modular kernel that flips on modules_disabled after boot (no
modules after boot)
3) modular kernel that allows per-subsystem unpriv module loading (all
modules loadable)

There is a demand for something between 2 and 3 where only root can
load modules. (And as pointed out in the series, this is _especially_
true for containers where the admin may want to leave module loading
alone in the init namespace, but stop any module loading in the
container.)

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 22:59           ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-27 22:59 UTC (permalink / raw)
  To: linux-security-module

On Mon, Nov 27, 2017 at 2:04 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 1:41 PM, Djalal Harouni <tixxdz@gmail.com> wrote:
>>
>> However, we are trying hard to abstract some semantics that are easy
>> to grasp, we are mutating capabilities and seccomp to have an
>> abstracted "yes/no" options for our endusers.
>
> Yes.
>
> Sadly, it looks like we actually do have users that just expect to
> load modules dynamically without any capabilities at all.
>
> So we can't actually disallow it by default at all, which imho makes
> this security option essentially useless.

The good news here is that this series comes with an expected user:
systemd, for which I suspect this will option will get wider and wider
use.

> A security option that people can't use without breaking their system
> is pointless.

Another bit of good news is that systems _depending_ cap-less module
loading are uncommon enough that many system builders can enable this
protection and nothing will break. (Even you thought it was rare
enough that we could just make this default-enabled.)

Besides, distros understand that they have to keep an eye out for
these things since the kernel has a long history of not allowing
default-enabled protections that change userspace behavior, etc.

> So I wonder if we can perhaps look at the places that actually do
> "requerst_module()", and start filtering them on that basis.
>
> Some of them will already have checked for capabilities.
>
> Others clearly expect to just work even _without_ capabilities (ie
> the bluetoothd case).
>
> So the whole "let's add a global config option" model is broken. There
> is no possible global rule. It will break things, which in turn mean
> that people won't turn it on (and we can't turn it on by default),
> which in turn makes this pointless.
>
> In other words, I really think that anything that just adds a mode
> flag cannot work.
>
> So instead of having one "modules_autoload_mode" thing, maybe the
> individual requerst_module() cases need to simply be audited.
>
> Put another way: I think the part of your patch series that does that
> "request_module_cap()" and makes the netdev modules use it is a good
> addition.
>
> It's the "mode" part I really don't agree with, because apparently we
> really need to default it to permissive.
>
> So how about instead:
>
>  - add that "request_module_cap()" and make the networking code that
> already uses CAP_ADMIN_NET use it.
>
>  - make "request_module()" itself default to being
> "request_module_cap(CAP_SYS_MODULE,..)"
>
>  - make sure that when the capability check fails, we print an error
> message, and then for the ones that trigger, we will audit them and
> see if it's ok.

The issue isn't "is it always okay to cap-less-load this module?" it's
"should the kernel's attack surface be allowed to grown by an
unprivileged user?" Nearly all the request_module() sites have already
been audited and adjusted to avoid _arbitrary_ module loading (usually
by adding module prefixes: netdev-, crypto-, etc), and that greatly
helped reduce the ability for a unprivileged user to load a
known-buggy module, but there will remain many subsystems that
continue to expect to grow the kernel's features on demand within
their subsystem, and sometimes those modules will have bugs that an
unprivileged user can exploit (this history of this happening is quite
real; it's not a theoretical concern). This is especially problematic
for distro kernels where they build tons of modules.

There isn't a way for an admin to say "I only want root to be able to
load modules". They can't delete all the "unused" kernel modules,
because maybe root will want the module, they can't change module file
permissions because the modules are read with init's permissions (and
it would be terrible to repeatedly change all the file perms each time
a new kernel got installed), modules can only be blacklisted (not
whitelisted) in /etc/modprobe.d/, etc.

> Because that "mode" flag defaulting to off will just mean that the
> default case will remain the existing unsafe one, and that's bad.
>
> Opt-in really doesn't work. We've done it.
>
> Global flags for varied behavior really doesn't work. We've done that
> too. Different cases want different behavior, the global flag is just
> fundamentally broken.

I don't disagree that a global should be avoided, but I'm struggling
to see another option here. We can't break userspace by default so we
can't restrict cap-less loading by default. But we can allow userspace
to _choose_ to break itself, especially within a container. This isn't
uncommon, especially for modules, where we even have the global
"modules_disabled" sysctl already. The level of granularity of control
here is the issue, and it's what this series solves.

The options I see for module loading control are:
1) monolithic kernel (no modules)
2) modular kernel that flips on modules_disabled after boot (no
modules after boot)
3) modular kernel that allows per-subsystem unpriv module loading (all
modules loadable)

There is a demand for something between 2 and 3 where only root can
load modules. (And as pointed out in the series, this is _especially_
true for containers where the admin may want to leave module loading
alone in the init namespace, but stop any module loading in the
container.)

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 22:59           ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-27 22:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 2:04 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 1:41 PM, Djalal Harouni <tixxdz@gmail.com> wrote:
>>
>> However, we are trying hard to abstract some semantics that are easy
>> to grasp, we are mutating capabilities and seccomp to have an
>> abstracted "yes/no" options for our endusers.
>
> Yes.
>
> Sadly, it looks like we actually do have users that just expect to
> load modules dynamically without any capabilities at all.
>
> So we can't actually disallow it by default at all, which imho makes
> this security option essentially useless.

The good news here is that this series comes with an expected user:
systemd, for which I suspect this will option will get wider and wider
use.

> A security option that people can't use without breaking their system
> is pointless.

Another bit of good news is that systems _depending_ cap-less module
loading are uncommon enough that many system builders can enable this
protection and nothing will break. (Even you thought it was rare
enough that we could just make this default-enabled.)

Besides, distros understand that they have to keep an eye out for
these things since the kernel has a long history of not allowing
default-enabled protections that change userspace behavior, etc.

> So I wonder if we can perhaps look at the places that actually do
> "requerst_module()", and start filtering them on that basis.
>
> Some of them will already have checked for capabilities.
>
> Others clearly expect to just work even _without_ capabilities (ie
> the bluetoothd case).
>
> So the whole "let's add a global config option" model is broken. There
> is no possible global rule. It will break things, which in turn mean
> that people won't turn it on (and we can't turn it on by default),
> which in turn makes this pointless.
>
> In other words, I really think that anything that just adds a mode
> flag cannot work.
>
> So instead of having one "modules_autoload_mode" thing, maybe the
> individual requerst_module() cases need to simply be audited.
>
> Put another way: I think the part of your patch series that does that
> "request_module_cap()" and makes the netdev modules use it is a good
> addition.
>
> It's the "mode" part I really don't agree with, because apparently we
> really need to default it to permissive.
>
> So how about instead:
>
>  - add that "request_module_cap()" and make the networking code that
> already uses CAP_ADMIN_NET use it.
>
>  - make "request_module()" itself default to being
> "request_module_cap(CAP_SYS_MODULE,..)"
>
>  - make sure that when the capability check fails, we print an error
> message, and then for the ones that trigger, we will audit them and
> see if it's ok.

The issue isn't "is it always okay to cap-less-load this module?" it's
"should the kernel's attack surface be allowed to grown by an
unprivileged user?" Nearly all the request_module() sites have already
been audited and adjusted to avoid _arbitrary_ module loading (usually
by adding module prefixes: netdev-, crypto-, etc), and that greatly
helped reduce the ability for a unprivileged user to load a
known-buggy module, but there will remain many subsystems that
continue to expect to grow the kernel's features on demand within
their subsystem, and sometimes those modules will have bugs that an
unprivileged user can exploit (this history of this happening is quite
real; it's not a theoretical concern). This is especially problematic
for distro kernels where they build tons of modules.

There isn't a way for an admin to say "I only want root to be able to
load modules". They can't delete all the "unused" kernel modules,
because maybe root will want the module, they can't change module file
permissions because the modules are read with init's permissions (and
it would be terrible to repeatedly change all the file perms each time
a new kernel got installed), modules can only be blacklisted (not
whitelisted) in /etc/modprobe.d/, etc.

> Because that "mode" flag defaulting to off will just mean that the
> default case will remain the existing unsafe one, and that's bad.
>
> Opt-in really doesn't work. We've done it.
>
> Global flags for varied behavior really doesn't work. We've done that
> too. Different cases want different behavior, the global flag is just
> fundamentally broken.

I don't disagree that a global should be avoided, but I'm struggling
to see another option here. We can't break userspace by default so we
can't restrict cap-less loading by default. But we can allow userspace
to _choose_ to break itself, especially within a container. This isn't
uncommon, especially for modules, where we even have the global
"modules_disabled" sysctl already. The level of granularity of control
here is the issue, and it's what this series solves.

The options I see for module loading control are:
1) monolithic kernel (no modules)
2) modular kernel that flips on modules_disabled after boot (no
modules after boot)
3) modular kernel that allows per-subsystem unpriv module loading (all
modules loadable)

There is a demand for something between 2 and 3 where only root can
load modules. (And as pointed out in the series, this is _especially_
true for containers where the admin may want to leave module loading
alone in the init namespace, but stop any module loading in the
container.)

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
  2017-11-27 22:31       ` James Morris
  (?)
@ 2017-11-27 23:04         ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-27 23:04 UTC (permalink / raw)
  To: James Morris, Linus Torvalds
  Cc: David Miller, Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, Ben Hutchings, Solar Designer,
	Serge E. Hallyn, Jessica Yu, Rusty Russell, LKML,
	linux-security-module, kernel-hardening, Jonathan Corbet,
	Ingo Molnar, Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 2:31 PM, James Morris <james.l.morris@oracle.com> wrote:
> On Tue, 28 Nov 2017, David Miller wrote:
>
>> From: Linus Torvalds <torvalds@linux-foundation.org>
>> Date: Mon, 27 Nov 2017 10:41:30 -0800
>>
>> > What are the real life use-cases for normal users having modules
>> > auto-load?
>>
>> User opens SCTP socket, SCTP protocol module loads.
>>
>> People build test cases via namespaces, and in that namespaces normal
>> users can setup virtual tunnel devices themselves, and those configure
>> operations can bring the tunnel module in.
>
> What about implementing a white list of modules which are able to be
> loaded by unprivileged users?
>
> Then, Linus' solution would look something like:
>
>         va_start(args, fmt);
>         ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
>         va_end(args);
>
>         if (WARN_ON_ONCE(!capable(CAP_SYS_MODULE) ||
>                          !capable(CAP_SYS_ADMIN) ||
>                          !capable(CAP_NET_ADMIN) ||
>                          !unprivileged_autoload(module_name)))
>                 return -EPERM;

We have some of this already with the module prefixes. Doing this
per-module would need to be exported to userspace, I think. It'd be
way too fragile sitting in the kernel.

To control this via modprobe, we'd need to expand modprobe to include
the user that is trying to load the module (so it can reason about who
is doing it), and then teach modprobe about that so the policy could
be exported to userspace.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 23:04         ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-27 23:04 UTC (permalink / raw)
  To: linux-security-module

On Mon, Nov 27, 2017 at 2:31 PM, James Morris <james.l.morris@oracle.com> wrote:
> On Tue, 28 Nov 2017, David Miller wrote:
>
>> From: Linus Torvalds <torvalds@linux-foundation.org>
>> Date: Mon, 27 Nov 2017 10:41:30 -0800
>>
>> > What are the real life use-cases for normal users having modules
>> > auto-load?
>>
>> User opens SCTP socket, SCTP protocol module loads.
>>
>> People build test cases via namespaces, and in that namespaces normal
>> users can setup virtual tunnel devices themselves, and those configure
>> operations can bring the tunnel module in.
>
> What about implementing a white list of modules which are able to be
> loaded by unprivileged users?
>
> Then, Linus' solution would look something like:
>
>         va_start(args, fmt);
>         ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
>         va_end(args);
>
>         if (WARN_ON_ONCE(!capable(CAP_SYS_MODULE) ||
>                          !capable(CAP_SYS_ADMIN) ||
>                          !capable(CAP_NET_ADMIN) ||
>                          !unprivileged_autoload(module_name)))
>                 return -EPERM;

We have some of this already with the module prefixes. Doing this
per-module would need to be exported to userspace, I think. It'd be
way too fragile sitting in the kernel.

To control this via modprobe, we'd need to expand modprobe to include
the user that is trying to load the module (so it can reason about who
is doing it), and then teach modprobe about that so the policy could
be exported to userspace.

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 23:04         ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-27 23:04 UTC (permalink / raw)
  To: James Morris, Linus Torvalds
  Cc: David Miller, Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, Ben Hutchings, Solar Designer,
	Serge E. Hallyn, Jessica Yu, Rusty Russell, LKML,
	linux-security-module, kernel-hardening, Jonathan Corbet,
	Ingo Molnar, Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 2:31 PM, James Morris <james.l.morris@oracle.com> wrote:
> On Tue, 28 Nov 2017, David Miller wrote:
>
>> From: Linus Torvalds <torvalds@linux-foundation.org>
>> Date: Mon, 27 Nov 2017 10:41:30 -0800
>>
>> > What are the real life use-cases for normal users having modules
>> > auto-load?
>>
>> User opens SCTP socket, SCTP protocol module loads.
>>
>> People build test cases via namespaces, and in that namespaces normal
>> users can setup virtual tunnel devices themselves, and those configure
>> operations can bring the tunnel module in.
>
> What about implementing a white list of modules which are able to be
> loaded by unprivileged users?
>
> Then, Linus' solution would look something like:
>
>         va_start(args, fmt);
>         ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
>         va_end(args);
>
>         if (WARN_ON_ONCE(!capable(CAP_SYS_MODULE) ||
>                          !capable(CAP_SYS_ADMIN) ||
>                          !capable(CAP_NET_ADMIN) ||
>                          !unprivileged_autoload(module_name)))
>                 return -EPERM;

We have some of this already with the module prefixes. Doing this
per-module would need to be exported to userspace, I think. It'd be
way too fragile sitting in the kernel.

To control this via modprobe, we'd need to expand modprobe to include
the user that is trying to load the module (so it can reason about who
is doing it), and then teach modprobe about that so the policy could
be exported to userspace.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-27 22:59           ` Kees Cook
  (?)
  (?)
@ 2017-11-27 23:14             ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 23:14 UTC (permalink / raw)
  To: Kees Cook
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 2:59 PM, Kees Cook <keescook@chromium.org> wrote:
>
> I don't disagree that a global should be avoided, but I'm struggling
> to see another option here. We can't break userspace by default so we
> can't restrict cap-less loading by default. But we can allow userspace
> to _choose_ to break itself, especially within a container. This isn't
> uncommon, especially for modules, where we even have the global
> "modules_disabled" sysctl already. The level of granularity of control
> here is the issue, and it's what this series solves.

So there's two "global" here

 - if a container were to choose to break itself, it should damn well
be container-specific, not some global option

   This part seems to be ok in the patch series, since the "global" is
really per-task. So it's not global in the "system-wide" sense.

 - if _one_ request_module() caller were to say "I don't want to be
loaded by a normal user", that doesn't mean that _other_
request_module() cases shouldn't.

   This is the part I'm objecting to, because it means that we can't
enable this stricter policy by default.

And the thing is, the patch series seems to already introduce largely
the better model of just making it site-specific. Introducing that
request_module_cap() thing and then using it for networking is a good
step.

But I also suspect that we _could_ just make the stricter rules
actually be default, if we just fixed the thing up to not be "every
request_module() is the same".

For example, several request_module() calls come from device node
opens, and it makes sense that we can just say: "if you have access to
the device node, then you have the right to request the module".

But that would need to be not a global "request_module()" behavior,
but a behavior that is tied to the particular call-site.

IOW, extend on that request_module_cap() model, and introduce
(perhaps) a "request_module_dev()" call that basically means "the user
opened the device node for the requested module".

Because those kinds of permissions aren't necessarily about
capabilities, but about things like "I'm in the dialout group, I get
to open tty devices and by implication request their modules".

And that _really_ isn't global behavior.  The fact that I might be
able to load a serial; module has *nothing* to do with whether I can
load some other kind of module at all.

That global mode is just wrong.

                Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 23:14             ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 23:14 UTC (permalink / raw)
  To: Kees Cook
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 2:59 PM, Kees Cook <keescook@chromium.org> wrote:
>
> I don't disagree that a global should be avoided, but I'm struggling
> to see another option here. We can't break userspace by default so we
> can't restrict cap-less loading by default. But we can allow userspace
> to _choose_ to break itself, especially within a container. This isn't
> uncommon, especially for modules, where we even have the global
> "modules_disabled" sysctl already. The level of granularity of control
> here is the issue, and it's what this series solves.

So there's two "global" here

 - if a container were to choose to break itself, it should damn well
be container-specific, not some global option

   This part seems to be ok in the patch series, since the "global" is
really per-task. So it's not global in the "system-wide" sense.

 - if _one_ request_module() caller were to say "I don't want to be
loaded by a normal user", that doesn't mean that _other_
request_module() cases shouldn't.

   This is the part I'm objecting to, because it means that we can't
enable this stricter policy by default.

And the thing is, the patch series seems to already introduce largely
the better model of just making it site-specific. Introducing that
request_module_cap() thing and then using it for networking is a good
step.

But I also suspect that we _could_ just make the stricter rules
actually be default, if we just fixed the thing up to not be "every
request_module() is the same".

For example, several request_module() calls come from device node
opens, and it makes sense that we can just say: "if you have access to
the device node, then you have the right to request the module".

But that would need to be not a global "request_module()" behavior,
but a behavior that is tied to the particular call-site.

IOW, extend on that request_module_cap() model, and introduce
(perhaps) a "request_module_dev()" call that basically means "the user
opened the device node for the requested module".

Because those kinds of permissions aren't necessarily about
capabilities, but about things like "I'm in the dialout group, I get
to open tty devices and by implication request their modules".

And that _really_ isn't global behavior.  The fact that I might be
able to load a serial; module has *nothing* to do with whether I can
load some other kind of module at all.

That global mode is just wrong.

                Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 23:14             ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 23:14 UTC (permalink / raw)
  To: linux-security-module

On Mon, Nov 27, 2017 at 2:59 PM, Kees Cook <keescook@chromium.org> wrote:
>
> I don't disagree that a global should be avoided, but I'm struggling
> to see another option here. We can't break userspace by default so we
> can't restrict cap-less loading by default. But we can allow userspace
> to _choose_ to break itself, especially within a container. This isn't
> uncommon, especially for modules, where we even have the global
> "modules_disabled" sysctl already. The level of granularity of control
> here is the issue, and it's what this series solves.

So there's two "global" here

 - if a container were to choose to break itself, it should damn well
be container-specific, not some global option

   This part seems to be ok in the patch series, since the "global" is
really per-task. So it's not global in the "system-wide" sense.

 - if _one_ request_module() caller were to say "I don't want to be
loaded by a normal user", that doesn't mean that _other_
request_module() cases shouldn't.

   This is the part I'm objecting to, because it means that we can't
enable this stricter policy by default.

And the thing is, the patch series seems to already introduce largely
the better model of just making it site-specific. Introducing that
request_module_cap() thing and then using it for networking is a good
step.

But I also suspect that we _could_ just make the stricter rules
actually be default, if we just fixed the thing up to not be "every
request_module() is the same".

For example, several request_module() calls come from device node
opens, and it makes sense that we can just say: "if you have access to
the device node, then you have the right to request the module".

But that would need to be not a global "request_module()" behavior,
but a behavior that is tied to the particular call-site.

IOW, extend on that request_module_cap() model, and introduce
(perhaps) a "request_module_dev()" call that basically means "the user
opened the device node for the requested module".

Because those kinds of permissions aren't necessarily about
capabilities, but about things like "I'm in the dialout group, I get
to open tty devices and by implication request their modules".

And that _really_ isn't global behavior.  The fact that I might be
able to load a serial; module has *nothing* to do with whether I can
load some other kind of module at all.

That global mode is just wrong.

                Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 23:14             ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 23:14 UTC (permalink / raw)
  To: Kees Cook
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 2:59 PM, Kees Cook <keescook@chromium.org> wrote:
>
> I don't disagree that a global should be avoided, but I'm struggling
> to see another option here. We can't break userspace by default so we
> can't restrict cap-less loading by default. But we can allow userspace
> to _choose_ to break itself, especially within a container. This isn't
> uncommon, especially for modules, where we even have the global
> "modules_disabled" sysctl already. The level of granularity of control
> here is the issue, and it's what this series solves.

So there's two "global" here

 - if a container were to choose to break itself, it should damn well
be container-specific, not some global option

   This part seems to be ok in the patch series, since the "global" is
really per-task. So it's not global in the "system-wide" sense.

 - if _one_ request_module() caller were to say "I don't want to be
loaded by a normal user", that doesn't mean that _other_
request_module() cases shouldn't.

   This is the part I'm objecting to, because it means that we can't
enable this stricter policy by default.

And the thing is, the patch series seems to already introduce largely
the better model of just making it site-specific. Introducing that
request_module_cap() thing and then using it for networking is a good
step.

But I also suspect that we _could_ just make the stricter rules
actually be default, if we just fixed the thing up to not be "every
request_module() is the same".

For example, several request_module() calls come from device node
opens, and it makes sense that we can just say: "if you have access to
the device node, then you have the right to request the module".

But that would need to be not a global "request_module()" behavior,
but a behavior that is tied to the particular call-site.

IOW, extend on that request_module_cap() model, and introduce
(perhaps) a "request_module_dev()" call that basically means "the user
opened the device node for the requested module".

Because those kinds of permissions aren't necessarily about
capabilities, but about things like "I'm in the dialout group, I get
to open tty devices and by implication request their modules".

And that _really_ isn't global behavior.  The fact that I might be
able to load a serial; module has *nothing* to do with whether I can
load some other kind of module at all.

That global mode is just wrong.

                Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-27 23:14             ` Linus Torvalds
  (?)
  (?)
@ 2017-11-27 23:19               ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-27 23:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 3:14 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 2:59 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> I don't disagree that a global should be avoided, but I'm struggling
>> to see another option here. We can't break userspace by default so we
>> can't restrict cap-less loading by default. But we can allow userspace
>> to _choose_ to break itself, especially within a container. This isn't
>> uncommon, especially for modules, where we even have the global
>> "modules_disabled" sysctl already. The level of granularity of control
>> here is the issue, and it's what this series solves.
>
> So there's two "global" here
>
>  - if a container were to choose to break itself, it should damn well
> be container-specific, not some global option
>
>    This part seems to be ok in the patch series, since the "global" is
> really per-task. So it's not global in the "system-wide" sense.
>
>  - if _one_ request_module() caller were to say "I don't want to be
> loaded by a normal user", that doesn't mean that _other_
> request_module() cases shouldn't.
>
>    This is the part I'm objecting to, because it means that we can't
> enable this stricter policy by default.
>
> And the thing is, the patch series seems to already introduce largely
> the better model of just making it site-specific. Introducing that
> request_module_cap() thing and then using it for networking is a good
> step.
>
> But I also suspect that we _could_ just make the stricter rules
> actually be default, if we just fixed the thing up to not be "every
> request_module() is the same".
>
> For example, several request_module() calls come from device node
> opens, and it makes sense that we can just say: "if you have access to
> the device node, then you have the right to request the module".
>
> But that would need to be not a global "request_module()" behavior,
> but a behavior that is tied to the particular call-site.
>
> IOW, extend on that request_module_cap() model, and introduce
> (perhaps) a "request_module_dev()" call that basically means "the user
> opened the device node for the requested module".
>
> Because those kinds of permissions aren't necessarily about
> capabilities, but about things like "I'm in the dialout group, I get
> to open tty devices and by implication request their modules".
>
> And that _really_ isn't global behavior.  The fact that I might be
> able to load a serial; module has *nothing* to do with whether I can
> load some other kind of module at all.
>
> That global mode is just wrong.

What about exporting this entirely to userspace, giving it as much
context as possible? i.e. inform modprobe about the user doing it,
maybe the subsystem, etc?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 23:19               ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-27 23:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 3:14 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 2:59 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> I don't disagree that a global should be avoided, but I'm struggling
>> to see another option here. We can't break userspace by default so we
>> can't restrict cap-less loading by default. But we can allow userspace
>> to _choose_ to break itself, especially within a container. This isn't
>> uncommon, especially for modules, where we even have the global
>> "modules_disabled" sysctl already. The level of granularity of control
>> here is the issue, and it's what this series solves.
>
> So there's two "global" here
>
>  - if a container were to choose to break itself, it should damn well
> be container-specific, not some global option
>
>    This part seems to be ok in the patch series, since the "global" is
> really per-task. So it's not global in the "system-wide" sense.
>
>  - if _one_ request_module() caller were to say "I don't want to be
> loaded by a normal user", that doesn't mean that _other_
> request_module() cases shouldn't.
>
>    This is the part I'm objecting to, because it means that we can't
> enable this stricter policy by default.
>
> And the thing is, the patch series seems to already introduce largely
> the better model of just making it site-specific. Introducing that
> request_module_cap() thing and then using it for networking is a good
> step.
>
> But I also suspect that we _could_ just make the stricter rules
> actually be default, if we just fixed the thing up to not be "every
> request_module() is the same".
>
> For example, several request_module() calls come from device node
> opens, and it makes sense that we can just say: "if you have access to
> the device node, then you have the right to request the module".
>
> But that would need to be not a global "request_module()" behavior,
> but a behavior that is tied to the particular call-site.
>
> IOW, extend on that request_module_cap() model, and introduce
> (perhaps) a "request_module_dev()" call that basically means "the user
> opened the device node for the requested module".
>
> Because those kinds of permissions aren't necessarily about
> capabilities, but about things like "I'm in the dialout group, I get
> to open tty devices and by implication request their modules".
>
> And that _really_ isn't global behavior.  The fact that I might be
> able to load a serial; module has *nothing* to do with whether I can
> load some other kind of module at all.
>
> That global mode is just wrong.

What about exporting this entirely to userspace, giving it as much
context as possible? i.e. inform modprobe about the user doing it,
maybe the subsystem, etc?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 23:19               ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-27 23:19 UTC (permalink / raw)
  To: linux-security-module

On Mon, Nov 27, 2017 at 3:14 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 2:59 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> I don't disagree that a global should be avoided, but I'm struggling
>> to see another option here. We can't break userspace by default so we
>> can't restrict cap-less loading by default. But we can allow userspace
>> to _choose_ to break itself, especially within a container. This isn't
>> uncommon, especially for modules, where we even have the global
>> "modules_disabled" sysctl already. The level of granularity of control
>> here is the issue, and it's what this series solves.
>
> So there's two "global" here
>
>  - if a container were to choose to break itself, it should damn well
> be container-specific, not some global option
>
>    This part seems to be ok in the patch series, since the "global" is
> really per-task. So it's not global in the "system-wide" sense.
>
>  - if _one_ request_module() caller were to say "I don't want to be
> loaded by a normal user", that doesn't mean that _other_
> request_module() cases shouldn't.
>
>    This is the part I'm objecting to, because it means that we can't
> enable this stricter policy by default.
>
> And the thing is, the patch series seems to already introduce largely
> the better model of just making it site-specific. Introducing that
> request_module_cap() thing and then using it for networking is a good
> step.
>
> But I also suspect that we _could_ just make the stricter rules
> actually be default, if we just fixed the thing up to not be "every
> request_module() is the same".
>
> For example, several request_module() calls come from device node
> opens, and it makes sense that we can just say: "if you have access to
> the device node, then you have the right to request the module".
>
> But that would need to be not a global "request_module()" behavior,
> but a behavior that is tied to the particular call-site.
>
> IOW, extend on that request_module_cap() model, and introduce
> (perhaps) a "request_module_dev()" call that basically means "the user
> opened the device node for the requested module".
>
> Because those kinds of permissions aren't necessarily about
> capabilities, but about things like "I'm in the dialout group, I get
> to open tty devices and by implication request their modules".
>
> And that _really_ isn't global behavior.  The fact that I might be
> able to load a serial; module has *nothing* to do with whether I can
> load some other kind of module at all.
>
> That global mode is just wrong.

What about exporting this entirely to userspace, giving it as much
context as possible? i.e. inform modprobe about the user doing it,
maybe the subsystem, etc?

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 23:19               ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-27 23:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 3:14 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 2:59 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> I don't disagree that a global should be avoided, but I'm struggling
>> to see another option here. We can't break userspace by default so we
>> can't restrict cap-less loading by default. But we can allow userspace
>> to _choose_ to break itself, especially within a container. This isn't
>> uncommon, especially for modules, where we even have the global
>> "modules_disabled" sysctl already. The level of granularity of control
>> here is the issue, and it's what this series solves.
>
> So there's two "global" here
>
>  - if a container were to choose to break itself, it should damn well
> be container-specific, not some global option
>
>    This part seems to be ok in the patch series, since the "global" is
> really per-task. So it's not global in the "system-wide" sense.
>
>  - if _one_ request_module() caller were to say "I don't want to be
> loaded by a normal user", that doesn't mean that _other_
> request_module() cases shouldn't.
>
>    This is the part I'm objecting to, because it means that we can't
> enable this stricter policy by default.
>
> And the thing is, the patch series seems to already introduce largely
> the better model of just making it site-specific. Introducing that
> request_module_cap() thing and then using it for networking is a good
> step.
>
> But I also suspect that we _could_ just make the stricter rules
> actually be default, if we just fixed the thing up to not be "every
> request_module() is the same".
>
> For example, several request_module() calls come from device node
> opens, and it makes sense that we can just say: "if you have access to
> the device node, then you have the right to request the module".
>
> But that would need to be not a global "request_module()" behavior,
> but a behavior that is tied to the particular call-site.
>
> IOW, extend on that request_module_cap() model, and introduce
> (perhaps) a "request_module_dev()" call that basically means "the user
> opened the device node for the requested module".
>
> Because those kinds of permissions aren't necessarily about
> capabilities, but about things like "I'm in the dialout group, I get
> to open tty devices and by implication request their modules".
>
> And that _really_ isn't global behavior.  The fact that I might be
> able to load a serial; module has *nothing* to do with whether I can
> load some other kind of module at all.
>
> That global mode is just wrong.

What about exporting this entirely to userspace, giving it as much
context as possible? i.e. inform modprobe about the user doing it,
maybe the subsystem, etc?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-27 23:19               ` Kees Cook
  (?)
  (?)
@ 2017-11-27 23:35                 ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 23:35 UTC (permalink / raw)
  To: Kees Cook
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 3:19 PM, Kees Cook <keescook@chromium.org> wrote:
>
> What about exporting this entirely to userspace, giving it as much
> context as possible? i.e. inform modprobe about the user doing it,
> maybe the subsystem, etc?

Yeah, except for the fact that we don't trust user-mode?

We used to do that exact thing. It was a nasty disaster, and caused
version skew and other horrible problems.

So no. Th e"let's just let user mode sort it out" doesn't work. User
mode doesn't sort anything out, it just makes it worse.

It's not some made-up example when I say that user-mode has decided
that kernel requests have to be completely serialized, and recusive
invocations will just hang.

So no. We do not go down that particular rat-hole. It's just a bigger
chance of getting things wrong.

                Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 23:35                 ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 23:35 UTC (permalink / raw)
  To: Kees Cook
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 3:19 PM, Kees Cook <keescook@chromium.org> wrote:
>
> What about exporting this entirely to userspace, giving it as much
> context as possible? i.e. inform modprobe about the user doing it,
> maybe the subsystem, etc?

Yeah, except for the fact that we don't trust user-mode?

We used to do that exact thing. It was a nasty disaster, and caused
version skew and other horrible problems.

So no. Th e"let's just let user mode sort it out" doesn't work. User
mode doesn't sort anything out, it just makes it worse.

It's not some made-up example when I say that user-mode has decided
that kernel requests have to be completely serialized, and recusive
invocations will just hang.

So no. We do not go down that particular rat-hole. It's just a bigger
chance of getting things wrong.

                Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 23:35                 ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 23:35 UTC (permalink / raw)
  To: linux-security-module

On Mon, Nov 27, 2017 at 3:19 PM, Kees Cook <keescook@chromium.org> wrote:
>
> What about exporting this entirely to userspace, giving it as much
> context as possible? i.e. inform modprobe about the user doing it,
> maybe the subsystem, etc?

Yeah, except for the fact that we don't trust user-mode?

We used to do that exact thing. It was a nasty disaster, and caused
version skew and other horrible problems.

So no. Th e"let's just let user mode sort it out" doesn't work. User
mode doesn't sort anything out, it just makes it worse.

It's not some made-up example when I say that user-mode has decided
that kernel requests have to be completely serialized, and recusive
invocations will just hang.

So no. We do not go down that particular rat-hole. It's just a bigger
chance of getting things wrong.

                Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-27 23:35                 ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-27 23:35 UTC (permalink / raw)
  To: Kees Cook
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 3:19 PM, Kees Cook <keescook@chromium.org> wrote:
>
> What about exporting this entirely to userspace, giving it as much
> context as possible? i.e. inform modprobe about the user doing it,
> maybe the subsystem, etc?

Yeah, except for the fact that we don't trust user-mode?

We used to do that exact thing. It was a nasty disaster, and caused
version skew and other horrible problems.

So no. Th e"let's just let user mode sort it out" doesn't work. User
mode doesn't sort anything out, it just makes it worse.

It's not some made-up example when I say that user-mode has decided
that kernel requests have to be completely serialized, and recusive
invocations will just hang.

So no. We do not go down that particular rat-hole. It's just a bigger
chance of getting things wrong.

                Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
  2017-11-27 23:04         ` Kees Cook
  (?)
@ 2017-11-27 23:44           ` James Morris
  -1 siblings, 0 replies; 266+ messages in thread
From: James Morris @ 2017-11-27 23:44 UTC (permalink / raw)
  To: Kees Cook
  Cc: Linus Torvalds, David Miller, Djalal Harouni, Andy Lutomirski,
	Andrew Morton, Luis R. Rodriguez, Ben Hutchings, Solar Designer,
	Serge E. Hallyn, Jessica Yu, Rusty Russell, LKML,
	linux-security-module, kernel-hardening, Jonathan Corbet,
	Ingo Molnar, Network Development, Peter Zijlstra

On Mon, 27 Nov 2017, Kees Cook wrote:

> >         if (WARN_ON_ONCE(!capable(CAP_SYS_MODULE) ||
> >                          !capable(CAP_SYS_ADMIN) ||
> >                          !capable(CAP_NET_ADMIN) ||
> >                          !unprivileged_autoload(module_name)))

(Side note: the capable() calls would ideally come after the whitelist 
check).

> We have some of this already with the module prefixes. Doing this
> per-module would need to be exported to userspace, I think. It'd be
> way too fragile sitting in the kernel.

What about writing a whitelist to /proc (per-task) or /sys/fs (global) ?

The per-task whitelist is inherited from the global one by default, or 
from a parent process if it's been modified in the parent.


-- 
James Morris
<james.l.morris@oracle.com>

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 23:44           ` James Morris
  0 siblings, 0 replies; 266+ messages in thread
From: James Morris @ 2017-11-27 23:44 UTC (permalink / raw)
  To: linux-security-module

On Mon, 27 Nov 2017, Kees Cook wrote:

> >         if (WARN_ON_ONCE(!capable(CAP_SYS_MODULE) ||
> >                          !capable(CAP_SYS_ADMIN) ||
> >                          !capable(CAP_NET_ADMIN) ||
> >                          !unprivileged_autoload(module_name)))

(Side note: the capable() calls would ideally come after the whitelist 
check).

> We have some of this already with the module prefixes. Doing this
> per-module would need to be exported to userspace, I think. It'd be
> way too fragile sitting in the kernel.

What about writing a whitelist to /proc (per-task) or /sys/fs (global) ?

The per-task whitelist is inherited from the global one by default, or 
from a parent process if it's been modified in the parent.


-- 
James Morris
<james.l.morris@oracle.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 0/5] Improve Module autoloading infrastructure
@ 2017-11-27 23:44           ` James Morris
  0 siblings, 0 replies; 266+ messages in thread
From: James Morris @ 2017-11-27 23:44 UTC (permalink / raw)
  To: Kees Cook
  Cc: Linus Torvalds, David Miller, Djalal Harouni, Andy Lutomirski,
	Andrew Morton, Luis R. Rodriguez, Ben Hutchings, Solar Designer,
	Serge E. Hallyn, Jessica Yu, Rusty Russell, LKML,
	linux-security-module, kernel-hardening, Jonathan Corbet,
	Ingo Molnar, Network Development, Peter Zijlstra

On Mon, 27 Nov 2017, Kees Cook wrote:

> >         if (WARN_ON_ONCE(!capable(CAP_SYS_MODULE) ||
> >                          !capable(CAP_SYS_ADMIN) ||
> >                          !capable(CAP_NET_ADMIN) ||
> >                          !unprivileged_autoload(module_name)))

(Side note: the capable() calls would ideally come after the whitelist 
check).

> We have some of this already with the module prefixes. Doing this
> per-module would need to be exported to userspace, I think. It'd be
> way too fragile sitting in the kernel.

What about writing a whitelist to /proc (per-task) or /sys/fs (global) ?

The per-task whitelist is inherited from the global one by default, or 
from a parent process if it's been modified in the parent.


-- 
James Morris
<james.l.morris@oracle.com>

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-27 23:14             ` Linus Torvalds
  (?)
  (?)
@ 2017-11-28  1:23               ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28  1:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 3:14 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 2:59 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> I don't disagree that a global should be avoided, but I'm struggling
>> to see another option here. We can't break userspace by default so we
>> can't restrict cap-less loading by default. But we can allow userspace
>> to _choose_ to break itself, especially within a container. This isn't
>> uncommon, especially for modules, where we even have the global
>> "modules_disabled" sysctl already. The level of granularity of control
>> here is the issue, and it's what this series solves.
>
> So there's two "global" here
>
>  - if a container were to choose to break itself, it should damn well
> be container-specific, not some global option
>
>    This part seems to be ok in the patch series, since the "global" is
> really per-task. So it's not global in the "system-wide" sense.

Right, though in the case of init, it could flip that toggle for
itself and it would then effectively be system-wide.

>  - if _one_ request_module() caller were to say "I don't want to be
> loaded by a normal user", that doesn't mean that _other_
> request_module() cases shouldn't.
>
>    This is the part I'm objecting to, because it means that we can't
> enable this stricter policy by default.

Okay, I see what you mean here. You want to clearly distinguish
between unprivileged and privileged-only. I'm unconvinced that's going
to change much, as the bulk of the exposed request_module() users are
already expecting to be unprivileged (and that's why they were all
converted to requiring a named prefix).

> And the thing is, the patch series seems to already introduce largely
> the better model of just making it site-specific. Introducing that
> request_module_cap() thing and then using it for networking is a good
> step.
>
> But I also suspect that we _could_ just make the stricter rules
> actually be default, if we just fixed the thing up to not be "every
> request_module() is the same".

When doing some of the older module name prefix work, I did consider
introducing a new request_module() API that included the prefix name
as an explicit argument (instead of embedding it in the format
string). We could easily start there, and then have "plain"
request_module() require privs. But we'll still need a way to say
"admin doesn't want unpriv module auto-loading".

> For example, several request_module() calls come from device node
> opens, and it makes sense that we can just say: "if you have access to
> the device node, then you have the right to request the module".

Many of these callers are using network interfaces to do this work, so
there isn't as clean a permission model associated with those like
there might be with a filesystem open(). But that doesn't matter (see
below).

> But that would need to be not a global "request_module()" behavior,
> but a behavior that is tied to the particular call-site.
>
> IOW, extend on that request_module_cap() model, and introduce
> (perhaps) a "request_module_dev()" call that basically means "the user
> opened the device node for the requested module".
>
> Because those kinds of permissions aren't necessarily about
> capabilities, but about things like "I'm in the dialout group, I get
> to open tty devices and by implication request their modules".

This really doesn't address the main concern that is the problem:
whitelisting vs blacklisting. In your example, the dialout group gives
access to specific ttys or serial ports, etc, but an admin may want a
way to make sure the users don't load some buggy line discipline. Now,
that admin could blacklist all those modules one at a time, but new
stuff might get introduced, it doesn't handle other subsystems, etc.

We need to provide a way to whitelist autoloaded modules. The
demonstrated need (to whitelist _no_ modules, addressed by this
series) provides that level of control on a task basis (effectively
making it container-specific).

> And that _really_ isn't global behavior.  The fact that I might be
> able to load a serial; module has *nothing* to do with whether I can
> load some other kind of module at all.
>
> That global mode is just wrong.

If the per-task thing stays and the global sysctl goes, that would be
fine by me. That still gives admins a way to control the autoload
behavior, assuming their init knows how to set the flag. The global
sysctl, in my mind, is really more of a way for an admin to do this
after the fact without rebooting, etc. But I don't have a strong
opinion about the global sysctl.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28  1:23               ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28  1:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 3:14 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 2:59 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> I don't disagree that a global should be avoided, but I'm struggling
>> to see another option here. We can't break userspace by default so we
>> can't restrict cap-less loading by default. But we can allow userspace
>> to _choose_ to break itself, especially within a container. This isn't
>> uncommon, especially for modules, where we even have the global
>> "modules_disabled" sysctl already. The level of granularity of control
>> here is the issue, and it's what this series solves.
>
> So there's two "global" here
>
>  - if a container were to choose to break itself, it should damn well
> be container-specific, not some global option
>
>    This part seems to be ok in the patch series, since the "global" is
> really per-task. So it's not global in the "system-wide" sense.

Right, though in the case of init, it could flip that toggle for
itself and it would then effectively be system-wide.

>  - if _one_ request_module() caller were to say "I don't want to be
> loaded by a normal user", that doesn't mean that _other_
> request_module() cases shouldn't.
>
>    This is the part I'm objecting to, because it means that we can't
> enable this stricter policy by default.

Okay, I see what you mean here. You want to clearly distinguish
between unprivileged and privileged-only. I'm unconvinced that's going
to change much, as the bulk of the exposed request_module() users are
already expecting to be unprivileged (and that's why they were all
converted to requiring a named prefix).

> And the thing is, the patch series seems to already introduce largely
> the better model of just making it site-specific. Introducing that
> request_module_cap() thing and then using it for networking is a good
> step.
>
> But I also suspect that we _could_ just make the stricter rules
> actually be default, if we just fixed the thing up to not be "every
> request_module() is the same".

When doing some of the older module name prefix work, I did consider
introducing a new request_module() API that included the prefix name
as an explicit argument (instead of embedding it in the format
string). We could easily start there, and then have "plain"
request_module() require privs. But we'll still need a way to say
"admin doesn't want unpriv module auto-loading".

> For example, several request_module() calls come from device node
> opens, and it makes sense that we can just say: "if you have access to
> the device node, then you have the right to request the module".

Many of these callers are using network interfaces to do this work, so
there isn't as clean a permission model associated with those like
there might be with a filesystem open(). But that doesn't matter (see
below).

> But that would need to be not a global "request_module()" behavior,
> but a behavior that is tied to the particular call-site.
>
> IOW, extend on that request_module_cap() model, and introduce
> (perhaps) a "request_module_dev()" call that basically means "the user
> opened the device node for the requested module".
>
> Because those kinds of permissions aren't necessarily about
> capabilities, but about things like "I'm in the dialout group, I get
> to open tty devices and by implication request their modules".

This really doesn't address the main concern that is the problem:
whitelisting vs blacklisting. In your example, the dialout group gives
access to specific ttys or serial ports, etc, but an admin may want a
way to make sure the users don't load some buggy line discipline. Now,
that admin could blacklist all those modules one at a time, but new
stuff might get introduced, it doesn't handle other subsystems, etc.

We need to provide a way to whitelist autoloaded modules. The
demonstrated need (to whitelist _no_ modules, addressed by this
series) provides that level of control on a task basis (effectively
making it container-specific).

> And that _really_ isn't global behavior.  The fact that I might be
> able to load a serial; module has *nothing* to do with whether I can
> load some other kind of module at all.
>
> That global mode is just wrong.

If the per-task thing stays and the global sysctl goes, that would be
fine by me. That still gives admins a way to control the autoload
behavior, assuming their init knows how to set the flag. The global
sysctl, in my mind, is really more of a way for an admin to do this
after the fact without rebooting, etc. But I don't have a strong
opinion about the global sysctl.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28  1:23               ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28  1:23 UTC (permalink / raw)
  To: linux-security-module

On Mon, Nov 27, 2017 at 3:14 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 2:59 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> I don't disagree that a global should be avoided, but I'm struggling
>> to see another option here. We can't break userspace by default so we
>> can't restrict cap-less loading by default. But we can allow userspace
>> to _choose_ to break itself, especially within a container. This isn't
>> uncommon, especially for modules, where we even have the global
>> "modules_disabled" sysctl already. The level of granularity of control
>> here is the issue, and it's what this series solves.
>
> So there's two "global" here
>
>  - if a container were to choose to break itself, it should damn well
> be container-specific, not some global option
>
>    This part seems to be ok in the patch series, since the "global" is
> really per-task. So it's not global in the "system-wide" sense.

Right, though in the case of init, it could flip that toggle for
itself and it would then effectively be system-wide.

>  - if _one_ request_module() caller were to say "I don't want to be
> loaded by a normal user", that doesn't mean that _other_
> request_module() cases shouldn't.
>
>    This is the part I'm objecting to, because it means that we can't
> enable this stricter policy by default.

Okay, I see what you mean here. You want to clearly distinguish
between unprivileged and privileged-only. I'm unconvinced that's going
to change much, as the bulk of the exposed request_module() users are
already expecting to be unprivileged (and that's why they were all
converted to requiring a named prefix).

> And the thing is, the patch series seems to already introduce largely
> the better model of just making it site-specific. Introducing that
> request_module_cap() thing and then using it for networking is a good
> step.
>
> But I also suspect that we _could_ just make the stricter rules
> actually be default, if we just fixed the thing up to not be "every
> request_module() is the same".

When doing some of the older module name prefix work, I did consider
introducing a new request_module() API that included the prefix name
as an explicit argument (instead of embedding it in the format
string). We could easily start there, and then have "plain"
request_module() require privs. But we'll still need a way to say
"admin doesn't want unpriv module auto-loading".

> For example, several request_module() calls come from device node
> opens, and it makes sense that we can just say: "if you have access to
> the device node, then you have the right to request the module".

Many of these callers are using network interfaces to do this work, so
there isn't as clean a permission model associated with those like
there might be with a filesystem open(). But that doesn't matter (see
below).

> But that would need to be not a global "request_module()" behavior,
> but a behavior that is tied to the particular call-site.
>
> IOW, extend on that request_module_cap() model, and introduce
> (perhaps) a "request_module_dev()" call that basically means "the user
> opened the device node for the requested module".
>
> Because those kinds of permissions aren't necessarily about
> capabilities, but about things like "I'm in the dialout group, I get
> to open tty devices and by implication request their modules".

This really doesn't address the main concern that is the problem:
whitelisting vs blacklisting. In your example, the dialout group gives
access to specific ttys or serial ports, etc, but an admin may want a
way to make sure the users don't load some buggy line discipline. Now,
that admin could blacklist all those modules one at a time, but new
stuff might get introduced, it doesn't handle other subsystems, etc.

We need to provide a way to whitelist autoloaded modules. The
demonstrated need (to whitelist _no_ modules, addressed by this
series) provides that level of control on a task basis (effectively
making it container-specific).

> And that _really_ isn't global behavior.  The fact that I might be
> able to load a serial; module has *nothing* to do with whether I can
> load some other kind of module at all.
>
> That global mode is just wrong.

If the per-task thing stays and the global sysctl goes, that would be
fine by me. That still gives admins a way to control the autoload
behavior, assuming their init knows how to set the flag. The global
sysctl, in my mind, is really more of a way for an admin to do this
after the fact without rebooting, etc. But I don't have a strong
opinion about the global sysctl.

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28  1:23               ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28  1:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton,
	Luis R. Rodriguez, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra

On Mon, Nov 27, 2017 at 3:14 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 27, 2017 at 2:59 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> I don't disagree that a global should be avoided, but I'm struggling
>> to see another option here. We can't break userspace by default so we
>> can't restrict cap-less loading by default. But we can allow userspace
>> to _choose_ to break itself, especially within a container. This isn't
>> uncommon, especially for modules, where we even have the global
>> "modules_disabled" sysctl already. The level of granularity of control
>> here is the issue, and it's what this series solves.
>
> So there's two "global" here
>
>  - if a container were to choose to break itself, it should damn well
> be container-specific, not some global option
>
>    This part seems to be ok in the patch series, since the "global" is
> really per-task. So it's not global in the "system-wide" sense.

Right, though in the case of init, it could flip that toggle for
itself and it would then effectively be system-wide.

>  - if _one_ request_module() caller were to say "I don't want to be
> loaded by a normal user", that doesn't mean that _other_
> request_module() cases shouldn't.
>
>    This is the part I'm objecting to, because it means that we can't
> enable this stricter policy by default.

Okay, I see what you mean here. You want to clearly distinguish
between unprivileged and privileged-only. I'm unconvinced that's going
to change much, as the bulk of the exposed request_module() users are
already expecting to be unprivileged (and that's why they were all
converted to requiring a named prefix).

> And the thing is, the patch series seems to already introduce largely
> the better model of just making it site-specific. Introducing that
> request_module_cap() thing and then using it for networking is a good
> step.
>
> But I also suspect that we _could_ just make the stricter rules
> actually be default, if we just fixed the thing up to not be "every
> request_module() is the same".

When doing some of the older module name prefix work, I did consider
introducing a new request_module() API that included the prefix name
as an explicit argument (instead of embedding it in the format
string). We could easily start there, and then have "plain"
request_module() require privs. But we'll still need a way to say
"admin doesn't want unpriv module auto-loading".

> For example, several request_module() calls come from device node
> opens, and it makes sense that we can just say: "if you have access to
> the device node, then you have the right to request the module".

Many of these callers are using network interfaces to do this work, so
there isn't as clean a permission model associated with those like
there might be with a filesystem open(). But that doesn't matter (see
below).

> But that would need to be not a global "request_module()" behavior,
> but a behavior that is tied to the particular call-site.
>
> IOW, extend on that request_module_cap() model, and introduce
> (perhaps) a "request_module_dev()" call that basically means "the user
> opened the device node for the requested module".
>
> Because those kinds of permissions aren't necessarily about
> capabilities, but about things like "I'm in the dialout group, I get
> to open tty devices and by implication request their modules".

This really doesn't address the main concern that is the problem:
whitelisting vs blacklisting. In your example, the dialout group gives
access to specific ttys or serial ports, etc, but an admin may want a
way to make sure the users don't load some buggy line discipline. Now,
that admin could blacklist all those modules one at a time, but new
stuff might get introduced, it doesn't handle other subsystems, etc.

We need to provide a way to whitelist autoloaded modules. The
demonstrated need (to whitelist _no_ modules, addressed by this
series) provides that level of control on a task basis (effectively
making it container-specific).

> And that _really_ isn't global behavior.  The fact that I might be
> able to load a serial; module has *nothing* to do with whether I can
> load some other kind of module at all.
>
> That global mode is just wrong.

If the per-task thing stays and the global sysctl goes, that would be
fine by me. That still gives admins a way to control the autoload
behavior, assuming their init knows how to set the flag. The global
sysctl, in my mind, is really more of a way for an admin to do this
after the fact without rebooting, etc. But I don't have a strong
opinion about the global sysctl.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-27 22:04         ` Linus Torvalds
  (?)
@ 2017-11-28 12:16           ` Geo Kozey
  -1 siblings, 0 replies; 266+ messages in thread
From: Geo Kozey @ 2017-11-28 12:16 UTC (permalink / raw)
  To: Linus Torvalds, Djalal Harouni
  Cc: Kees Cook, James Morris, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet

> From: Linus Torvalds <torvalds@linux-foundation.org>
> Sent: Mon Nov 27 23:04:58 CET 2017
> To: Djalal Harouni <tixxdz@gmail.com>
> Cc: Kees Cook <keescook@chromium.org>, Andy Lutomirski <luto@kernel.org>, Andrew Morton <akpm@linux-foundation.org>, Luis R. Rodriguez <mcgrof@kernel.org>, James Morris <james.l.morris@oracle.com>, Ben Hutchings <ben.hutchings@codethink.co.uk>, Solar Designer <solar@openwall.com>, Serge Hallyn <serge@hallyn.com>, Jessica Yu <jeyu@kernel.org>, Rusty Russell <rusty@rustcorp.com.au>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, LSM List <linux-security-module@vger.kernel.org>, kernel-hardening@lists.openwall.com <kernel-hardening@lists.openwall.com>, Jonathan Corbet <corbet@lwn.net>, Ingo Molnar <mingo@kernel.org>, David S. Miller <davem@davemloft.net>, Network Development <netdev@vger.kernel.org>, Peter Zijlstra <peterz@infradead.org>
> Subject: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
> 
> 
> On Mon, Nov 27, 2017 at 1:41 PM, Djalal Harouni <tixxdz@gmail.com> wrote:
> >
> > However, we are trying hard to abstract some semantics that are easy
> > to grasp, we are mutating capabilities and seccomp to have an
> > abstracted "yes/no" options for our endusers.
> 
> Yes.
> 
> Sadly, it looks like we actually do have users that just expect to
> load modules dynamically without any capabilities at all.
> 
> So we can't actually disallow it by default at all, which imho makes
> this security option essentially useless.
> 
> A security option that people can't use without breaking their system
> is pointless.
> 
> We saw that with SELinux - people ended up just disabling it for
> _years_, simply because it ended up breaking so much in practice. And
> yes, it got fixed eventually, but at an incredibly high maintenance
> cost of all the crazy rules databases.
> 
> > Alright, but I guess we are stuck, is there something better on how we
> > can do this or describe this ?
> 
> So I wonder if we can perhaps look at the places that actually do
> "requerst_module()", and start filtering them on that basis.
> 
> Some of them will already have checked for capabilities.
> 
> Others clearly expect to juist work even _without_ capabilities (ie
> the bluetoothd case).
> 
> So the whole "let's add a global config option" model is broken. There
> is no possible global rule. It will break things, which in turn mean
> that people won't turn it on (and we can't turn it on by default),
> which in turn makes this pointless.
> 
> In other words, I really think that anything that just adds a mode
> flag cannot work.
> 
> So instead of having one "modules_autoload_mode" thing, maybe the
> individual requerst_module() cases need to simply be audited.
> 
> Put another way: I think the part of your patch series that does that
> "request_module_cap()" and makes the netdev modules use it is a good
> addition.
> 
> It's the "mode" part I really don't agree with, because apparently we
> really need to default it to permissive.
> 
> So how about instead:
> 
>  - add that "request_module_cap()" and make the networking code that
> already uses CAP_ADMIN_NET use it.
> 
>  - make "request_module()" itself default to being
> "request_module_cap(CAP_SYS_MODULE,..)"
> 
>  - make sure that when the capability check fails, we print an error
> message, and then for the ones that trigger, we will audit them and
> see if it's ok.
> 
> Because that "mode" flag defaulting to off will just mean that the
> default case will remain the existing unsafe one, and that's bad.
> 
> Opt-in really doesn't work. We've done it.
> 
> Global flags for varied behavior really doesn't work. We've done that
> too. Different cases want different behavior, the global flag is just
> fundamentally broken.
> 
>               Linus
> ----------------------------------------

The only reason this is off by default is kernel rule No 1. Most of new security features are off or WARN_ON because of it otherwise bad things happen on mailinglists :) .

It seems to me that you demand an impossible job here. Don't break userspace and don't disable by default. It can't happen due to years of technical debt. I mean everyone prefers default-on but we can choose best alternative instead of maintaining status quo.

Userspace can be configured in a way which is compatible with those changes being on the same as it can be configured to work with selinux. That means on distro level or sysadmin level it's a valuable tool. It's better than nothing and it's better than using some out-of-tree patches instead. Switching one sysctl would make their life easier.

There is demand for kernel hardening stuff and default-off behavior is good compromise between breaking changes and nothing at all. The kernel has very diverse users group with different needs and capabilities so there is no one size fits all. Targeting all of them will often end in nothing useful to anyone.

Yours sincerely

G. K.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 12:16           ` Geo Kozey
  0 siblings, 0 replies; 266+ messages in thread
From: Geo Kozey @ 2017-11-28 12:16 UTC (permalink / raw)
  To: linux-security-module

> From: Linus Torvalds <torvalds@linux-foundation.org>
> Sent: Mon Nov 27 23:04:58 CET 2017
> To: Djalal Harouni <tixxdz@gmail.com>
> Cc: Kees Cook <keescook@chromium.org>, Andy Lutomirski <luto@kernel.org>, Andrew Morton <akpm@linux-foundation.org>, Luis R. Rodriguez <mcgrof@kernel.org>, James Morris <james.l.morris@oracle.com>, Ben Hutchings <ben.hutchings@codethink.co.uk>, Solar Designer <solar@openwall.com>, Serge Hallyn <serge@hallyn.com>, Jessica Yu <jeyu@kernel.org>, Rusty Russell <rusty@rustcorp.com.au>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, LSM List <linux-security-module@vger.kernel.org>, kernel-hardening at lists.openwall.com <kernel-hardening@lists.openwall.com>, Jonathan Corbet <corbet@lwn.net>, Ingo Molnar <mingo@kernel.org>, David S. Miller <davem@davemloft.net>, Network Development <netdev@vger.kernel.org>, Peter Zijlstra <peterz@infradead.org>
> Subject: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
> 
> 
> On Mon, Nov 27, 2017 at 1:41 PM, Djalal Harouni <tixxdz@gmail.com> wrote:
> >
> > However, we are trying hard to abstract some semantics that are easy
> > to grasp, we are mutating capabilities and seccomp to have an
> > abstracted "yes/no" options for our endusers.
> 
> Yes.
> 
> Sadly, it looks like we actually do have users that just expect to
> load modules dynamically without any capabilities at all.
> 
> So we can't actually disallow it by default at all, which imho makes
> this security option essentially useless.
> 
> A security option that people can't use without breaking their system
> is pointless.
> 
> We saw that with SELinux - people ended up just disabling it for
> _years_, simply because it ended up breaking so much in practice. And
> yes, it got fixed eventually, but at an incredibly high maintenance
> cost of all the crazy rules databases.
> 
> > Alright, but I guess we are stuck, is there something better on how we
> > can do this or describe this ?
> 
> So I wonder if we can perhaps look at the places that actually do
> "requerst_module()", and start filtering them on that basis.
> 
> Some of them will already have checked for capabilities.
> 
> Others clearly expect to juist work even _without_ capabilities (ie
> the bluetoothd case).
> 
> So the whole "let's add a global config option" model is broken. There
> is no possible global rule. It will break things, which in turn mean
> that people won't turn it on (and we can't turn it on by default),
> which in turn makes this pointless.
> 
> In other words, I really think that anything that just adds a mode
> flag cannot work.
> 
> So instead of having one "modules_autoload_mode" thing, maybe the
> individual requerst_module() cases need to simply be audited.
> 
> Put another way: I think the part of your patch series that does that
> "request_module_cap()" and makes the netdev modules use it is a good
> addition.
> 
> It's the "mode" part I really don't agree with, because apparently we
> really need to default it to permissive.
> 
> So how about instead:
> 
>  - add that "request_module_cap()" and make the networking code that
> already uses CAP_ADMIN_NET use it.
> 
>  - make "request_module()" itself default to being
> "request_module_cap(CAP_SYS_MODULE,..)"
> 
>  - make sure that when the capability check fails, we print an error
> message, and then for the ones that trigger, we will audit them and
> see if it's ok.
> 
> Because that "mode" flag defaulting to off will just mean that the
> default case will remain the existing unsafe one, and that's bad.
> 
> Opt-in really doesn't work. We've done it.
> 
> Global flags for varied behavior really doesn't work. We've done that
> too. Different cases want different behavior, the global flag is just
> fundamentally broken.
> 
>               Linus
> ----------------------------------------

The only reason this is off by default is kernel rule No 1. Most of new security features are off or WARN_ON because of it otherwise bad things happen on mailinglists :) .

It seems to me that you demand an impossible job here. Don't break userspace and don't disable by default. It can't happen due to years of technical debt. I mean everyone prefers default-on but we can choose best alternative instead of maintaining status quo.

Userspace can be configured in a way which is compatible with those changes being on the same as it can be configured to work with selinux. That means on distro level or sysadmin level it's a valuable tool. It's better than nothing and it's better than using some out-of-tree patches instead. Switching one sysctl would make their life easier.

There is demand for kernel hardening stuff and default-off behavior is good compromise between breaking changes and nothing at all. The kernel has very diverse users group with different needs and capabilities so there is no one size fits all. Targeting all of them will often end in nothing useful to anyone.

Yours sincerely

G. K.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 12:16           ` Geo Kozey
  0 siblings, 0 replies; 266+ messages in thread
From: Geo Kozey @ 2017-11-28 12:16 UTC (permalink / raw)
  To: Linus Torvalds, Djalal Harouni
  Cc: Kees Cook, James Morris, Linux Kernel Mailing List, LSM List,
	kernel-hardening, Jonathan Corbet

> From: Linus Torvalds <torvalds@linux-foundation.org>
> Sent: Mon Nov 27 23:04:58 CET 2017
> To: Djalal Harouni <tixxdz@gmail.com>
> Cc: Kees Cook <keescook@chromium.org>, Andy Lutomirski <luto@kernel.org>, Andrew Morton <akpm@linux-foundation.org>, Luis R. Rodriguez <mcgrof@kernel.org>, James Morris <james.l.morris@oracle.com>, Ben Hutchings <ben.hutchings@codethink.co.uk>, Solar Designer <solar@openwall.com>, Serge Hallyn <serge@hallyn.com>, Jessica Yu <jeyu@kernel.org>, Rusty Russell <rusty@rustcorp.com.au>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, LSM List <linux-security-module@vger.kernel.org>, kernel-hardening@lists.openwall.com <kernel-hardening@lists.openwall.com>, Jonathan Corbet <corbet@lwn.net>, Ingo Molnar <mingo@kernel.org>, David S. Miller <davem@davemloft.net>, Network Development <netdev@vger.kernel.org>, Peter Zijlstra <peterz@infradead.org>
> Subject: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
> 
> 
> On Mon, Nov 27, 2017 at 1:41 PM, Djalal Harouni <tixxdz@gmail.com> wrote:
> >
> > However, we are trying hard to abstract some semantics that are easy
> > to grasp, we are mutating capabilities and seccomp to have an
> > abstracted "yes/no" options for our endusers.
> 
> Yes.
> 
> Sadly, it looks like we actually do have users that just expect to
> load modules dynamically without any capabilities at all.
> 
> So we can't actually disallow it by default at all, which imho makes
> this security option essentially useless.
> 
> A security option that people can't use without breaking their system
> is pointless.
> 
> We saw that with SELinux - people ended up just disabling it for
> _years_, simply because it ended up breaking so much in practice. And
> yes, it got fixed eventually, but at an incredibly high maintenance
> cost of all the crazy rules databases.
> 
> > Alright, but I guess we are stuck, is there something better on how we
> > can do this or describe this ?
> 
> So I wonder if we can perhaps look at the places that actually do
> "requerst_module()", and start filtering them on that basis.
> 
> Some of them will already have checked for capabilities.
> 
> Others clearly expect to juist work even _without_ capabilities (ie
> the bluetoothd case).
> 
> So the whole "let's add a global config option" model is broken. There
> is no possible global rule. It will break things, which in turn mean
> that people won't turn it on (and we can't turn it on by default),
> which in turn makes this pointless.
> 
> In other words, I really think that anything that just adds a mode
> flag cannot work.
> 
> So instead of having one "modules_autoload_mode" thing, maybe the
> individual requerst_module() cases need to simply be audited.
> 
> Put another way: I think the part of your patch series that does that
> "request_module_cap()" and makes the netdev modules use it is a good
> addition.
> 
> It's the "mode" part I really don't agree with, because apparently we
> really need to default it to permissive.
> 
> So how about instead:
> 
>  - add that "request_module_cap()" and make the networking code that
> already uses CAP_ADMIN_NET use it.
> 
>  - make "request_module()" itself default to being
> "request_module_cap(CAP_SYS_MODULE,..)"
> 
>  - make sure that when the capability check fails, we print an error
> message, and then for the ones that trigger, we will audit them and
> see if it's ok.
> 
> Because that "mode" flag defaulting to off will just mean that the
> default case will remain the existing unsafe one, and that's bad.
> 
> Opt-in really doesn't work. We've done it.
> 
> Global flags for varied behavior really doesn't work. We've done that
> too. Different cases want different behavior, the global flag is just
> fundamentally broken.
> 
>               Linus
> ----------------------------------------

The only reason this is off by default is kernel rule No 1. Most of new security features are off or WARN_ON because of it otherwise bad things happen on mailinglists :) .

It seems to me that you demand an impossible job here. Don't break userspace and don't disable by default. It can't happen due to years of technical debt. I mean everyone prefers default-on but we can choose best alternative instead of maintaining status quo.

Userspace can be configured in a way which is compatible with those changes being on the same as it can be configured to work with selinux. That means on distro level or sysadmin level it's a valuable tool. It's better than nothing and it's better than using some out-of-tree patches instead. Switching one sysctl would make their life easier.

There is demand for kernel hardening stuff and default-off behavior is good compromise between breaking changes and nothing at all. The kernel has very diverse users group with different needs and capabilities so there is no one size fits all. Targeting all of them will often end in nothing useful to anyone.

Yours sincerely

G. K.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-27 17:18   ` Djalal Harouni
  (?)
@ 2017-11-28 19:14     ` Luis R. Rodriguez
  -1 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-28 19:14 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	netdev, Peter Zijlstra, Linus Torvalds

On Mon, Nov 27, 2017 at 06:18:34PM +0100, Djalal Harouni wrote:
...

> After a discussion with Rusty Russell [1], the suggestion was to pass
> the capability from request_module() to security_kernel_module_request()
> for 'netdev-%s' modules that need CAP_NET_ADMIN, and after review from
> Kees Cook [2] and experimenting with the code, the patch now does the
> following:
> 
> * Adds the request_module_cap() function.
> * Updates the __request_module() to take the "required_cap" argument
>         with the "prefix".

...

> Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
> ---
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index bc6addd..679d401 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -139,13 +147,22 @@ int __request_module(bool wait, const char *fmt, ...)
>  	if (!modprobe_path[0])
>  		return 0;
>  
> +	/*
> +	 * Lets attach the prefix to the module name
> +	 */
> +	if (prefix != NULL && *prefix != '\0') {
> +		len += snprintf(module_name, MODULE_NAME_LEN, "%s-", prefix);
> +		if (len >= MODULE_NAME_LEN)
> +			return -ENAMETOOLONG;
> +	}
> +
>  	va_start(args, fmt);
> -	ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
> +	ret = vsnprintf(module_name + len, MODULE_NAME_LEN - len, fmt, args);
>  	va_end(args);
> -	if (ret >= MODULE_NAME_LEN)
> +	if (ret >= MODULE_NAME_LEN - len)
>  		return -ENAMETOOLONG;
>  
> -	ret = security_kernel_module_request(module_name);
> +	ret = security_kernel_module_request(module_name, required_cap, prefix);
>  	if (ret)
>  		return ret;
>  

kmod is just a helper to poke userpsace to load a module, that's it.

The old init_module() and newer finit_module() do the real handy work or
module loading, and both currently only use may_init_module():

static int may_init_module(void)
{
	if (!capable(CAP_SYS_MODULE) || modules_disabled)
		return -EPERM;                                                  

	return 0;
}

This begs the question:

  o If userspace just tries to just use raw finit_module() do we want similar
    checks?

Otherwise, correct me if I'm wrong this all seems pointless.

If we want something similar I think we might need to be processing aliases and
check for the aliases for their desired restrictions on finit_module(),
otherwise userspace can skip through the checks if the module name does not
match the alias prefix.

To be clear, aliases are completely ignored today on load_module(), so loading
'xfs' with finit_module() will just have the kernel know about 'xfs' not
'fs-xfs'.

So we currently do not process aliases in kernel.

I have debugging patches to enable us to process them, but they are just for
debugging and I've been meaning to send them in for review. I designed them
only for debugging given last time someone suggested for aliases processing to
be added, the only use case we found was a pre-optimizations we decided to avoid
pursuing. Debugging is a good reason to have alias processing in-kernel though.

The pre-optimization we decided to stay away from was to check if the requested
module via request_module() was already loaded *and* also check if the name passed
matches any of the existing module aliases for currently loaded modules. Today
request_module() does not even check if a requested module is already loaded,
its a stupid loader, it just goes to userspace, and lets userspace figure it
out. Userspace in turn could check for aliases, but it could lie, or not be up
to date to do that.

The pre-optmization is a theoretical gain only then, and if userspace had
proper alias checking it is arguable that it may perform just as equal.
To help valuate these sorts of things we now have:

tools/testing/selftests/kmod/kmod.sh

So further patches can use and test impact with it.

Anyway -- so aliasing is currently only a debugging consideration, but without
processing aliases, all this work seems pointless to me as the real loader is
in finit_module().
 
kmod is just a stupid loader and uses the kernel usermode helper to do work.

  Luis

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 19:14     ` Luis R. Rodriguez
  0 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-28 19:14 UTC (permalink / raw)
  To: linux-security-module

On Mon, Nov 27, 2017 at 06:18:34PM +0100, Djalal Harouni wrote:
...

> After a discussion with Rusty Russell [1], the suggestion was to pass
> the capability from request_module() to security_kernel_module_request()
> for 'netdev-%s' modules that need CAP_NET_ADMIN, and after review from
> Kees Cook [2] and experimenting with the code, the patch now does the
> following:
> 
> * Adds the request_module_cap() function.
> * Updates the __request_module() to take the "required_cap" argument
>         with the "prefix".

...

> Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
> ---
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index bc6addd..679d401 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -139,13 +147,22 @@ int __request_module(bool wait, const char *fmt, ...)
>  	if (!modprobe_path[0])
>  		return 0;
>  
> +	/*
> +	 * Lets attach the prefix to the module name
> +	 */
> +	if (prefix != NULL && *prefix != '\0') {
> +		len += snprintf(module_name, MODULE_NAME_LEN, "%s-", prefix);
> +		if (len >= MODULE_NAME_LEN)
> +			return -ENAMETOOLONG;
> +	}
> +
>  	va_start(args, fmt);
> -	ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
> +	ret = vsnprintf(module_name + len, MODULE_NAME_LEN - len, fmt, args);
>  	va_end(args);
> -	if (ret >= MODULE_NAME_LEN)
> +	if (ret >= MODULE_NAME_LEN - len)
>  		return -ENAMETOOLONG;
>  
> -	ret = security_kernel_module_request(module_name);
> +	ret = security_kernel_module_request(module_name, required_cap, prefix);
>  	if (ret)
>  		return ret;
>  

kmod is just a helper to poke userpsace to load a module, that's it.

The old init_module() and newer finit_module() do the real handy work or
module loading, and both currently only use may_init_module():

static int may_init_module(void)
{
	if (!capable(CAP_SYS_MODULE) || modules_disabled)
		return -EPERM;                                                  

	return 0;
}

This begs the question:

  o If userspace just tries to just use raw finit_module() do we want similar
    checks?

Otherwise, correct me if I'm wrong this all seems pointless.

If we want something similar I think we might need to be processing aliases and
check for the aliases for their desired restrictions on finit_module(),
otherwise userspace can skip through the checks if the module name does not
match the alias prefix.

To be clear, aliases are completely ignored today on load_module(), so loading
'xfs' with finit_module() will just have the kernel know about 'xfs' not
'fs-xfs'.

So we currently do not process aliases in kernel.

I have debugging patches to enable us to process them, but they are just for
debugging and I've been meaning to send them in for review. I designed them
only for debugging given last time someone suggested for aliases processing to
be added, the only use case we found was a pre-optimizations we decided to avoid
pursuing. Debugging is a good reason to have alias processing in-kernel though.

The pre-optimization we decided to stay away from was to check if the requested
module via request_module() was already loaded *and* also check if the name passed
matches any of the existing module aliases for currently loaded modules. Today
request_module() does not even check if a requested module is already loaded,
its a stupid loader, it just goes to userspace, and lets userspace figure it
out. Userspace in turn could check for aliases, but it could lie, or not be up
to date to do that.

The pre-optmization is a theoretical gain only then, and if userspace had
proper alias checking it is arguable that it may perform just as equal.
To help valuate these sorts of things we now have:

tools/testing/selftests/kmod/kmod.sh

So further patches can use and test impact with it.

Anyway -- so aliasing is currently only a debugging consideration, but without
processing aliases, all this work seems pointless to me as the real loader is
in finit_module().
 
kmod is just a stupid loader and uses the kernel usermode helper to do work.

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 19:14     ` Luis R. Rodriguez
  0 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-28 19:14 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	netdev, Peter Zijlstra, Linus Torvalds

On Mon, Nov 27, 2017 at 06:18:34PM +0100, Djalal Harouni wrote:
...

> After a discussion with Rusty Russell [1], the suggestion was to pass
> the capability from request_module() to security_kernel_module_request()
> for 'netdev-%s' modules that need CAP_NET_ADMIN, and after review from
> Kees Cook [2] and experimenting with the code, the patch now does the
> following:
> 
> * Adds the request_module_cap() function.
> * Updates the __request_module() to take the "required_cap" argument
>         with the "prefix".

...

> Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
> ---
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index bc6addd..679d401 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -139,13 +147,22 @@ int __request_module(bool wait, const char *fmt, ...)
>  	if (!modprobe_path[0])
>  		return 0;
>  
> +	/*
> +	 * Lets attach the prefix to the module name
> +	 */
> +	if (prefix != NULL && *prefix != '\0') {
> +		len += snprintf(module_name, MODULE_NAME_LEN, "%s-", prefix);
> +		if (len >= MODULE_NAME_LEN)
> +			return -ENAMETOOLONG;
> +	}
> +
>  	va_start(args, fmt);
> -	ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
> +	ret = vsnprintf(module_name + len, MODULE_NAME_LEN - len, fmt, args);
>  	va_end(args);
> -	if (ret >= MODULE_NAME_LEN)
> +	if (ret >= MODULE_NAME_LEN - len)
>  		return -ENAMETOOLONG;
>  
> -	ret = security_kernel_module_request(module_name);
> +	ret = security_kernel_module_request(module_name, required_cap, prefix);
>  	if (ret)
>  		return ret;
>  

kmod is just a helper to poke userpsace to load a module, that's it.

The old init_module() and newer finit_module() do the real handy work or
module loading, and both currently only use may_init_module():

static int may_init_module(void)
{
	if (!capable(CAP_SYS_MODULE) || modules_disabled)
		return -EPERM;                                                  

	return 0;
}

This begs the question:

  o If userspace just tries to just use raw finit_module() do we want similar
    checks?

Otherwise, correct me if I'm wrong this all seems pointless.

If we want something similar I think we might need to be processing aliases and
check for the aliases for their desired restrictions on finit_module(),
otherwise userspace can skip through the checks if the module name does not
match the alias prefix.

To be clear, aliases are completely ignored today on load_module(), so loading
'xfs' with finit_module() will just have the kernel know about 'xfs' not
'fs-xfs'.

So we currently do not process aliases in kernel.

I have debugging patches to enable us to process them, but they are just for
debugging and I've been meaning to send them in for review. I designed them
only for debugging given last time someone suggested for aliases processing to
be added, the only use case we found was a pre-optimizations we decided to avoid
pursuing. Debugging is a good reason to have alias processing in-kernel though.

The pre-optimization we decided to stay away from was to check if the requested
module via request_module() was already loaded *and* also check if the name passed
matches any of the existing module aliases for currently loaded modules. Today
request_module() does not even check if a requested module is already loaded,
its a stupid loader, it just goes to userspace, and lets userspace figure it
out. Userspace in turn could check for aliases, but it could lie, or not be up
to date to do that.

The pre-optmization is a theoretical gain only then, and if userspace had
proper alias checking it is arguable that it may perform just as equal.
To help valuate these sorts of things we now have:

tools/testing/selftests/kmod/kmod.sh

So further patches can use and test impact with it.

Anyway -- so aliasing is currently only a debugging consideration, but without
processing aliases, all this work seems pointless to me as the real loader is
in finit_module().
 
kmod is just a stupid loader and uses the kernel usermode helper to do work.

  Luis

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 12:16           ` Geo Kozey
  (?)
@ 2017-11-28 19:32             ` Theodore Ts'o
  -1 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-28 19:32 UTC (permalink / raw)
  To: Geo Kozey
  Cc: Linus Torvalds, Djalal Harouni, Kees Cook, James Morris,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet

On Tue, Nov 28, 2017 at 01:16:59PM +0100, Geo Kozey wrote:
> 
> Userspace can be configured in a way which is compatible with those
> changes being on the same as it can be configured to work with
> selinux. That means on distro level or sysadmin level it's a
> valuable tool. It's better than nothing and it's better than using
> some out-of-tree patches instead. Switching one sysctl would make
> their life easier.

If *selinux* can opt-in to something more stringent, such that when
you upgrade to a new version of selinux which enables something which
breaks all modules unless you set up the rules corretly, I don't see a
problem with it.  It might force distributions not to go to the latest
version of SELinux because users get cranky when their systems get
broken, but for people like me, who *still* don't use SELinux because
every few years, i try to enable on my development laptop running
Debian, watch ***far*** too much stuff break. and then turn it off
again.  So tieing it to SELinux (as far as I am concerned) reduces it to
a previously unsolved problem.  :-)

But that's different from opting it on by default for non-SELinux
users.  To which I can only say, "Please, No."

	   	       	    	 	  - Ted

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 19:32             ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-28 19:32 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 01:16:59PM +0100, Geo Kozey wrote:
> 
> Userspace can be configured in a way which is compatible with those
> changes being on the same as it can be configured to work with
> selinux. That means on distro level or sysadmin level it's a
> valuable tool. It's better than nothing and it's better than using
> some out-of-tree patches instead. Switching one sysctl would make
> their life easier.

If *selinux* can opt-in to something more stringent, such that when
you upgrade to a new version of selinux which enables something which
breaks all modules unless you set up the rules corretly, I don't see a
problem with it.  It might force distributions not to go to the latest
version of SELinux because users get cranky when their systems get
broken, but for people like me, who *still* don't use SELinux because
every few years, i try to enable on my development laptop running
Debian, watch ***far*** too much stuff break. and then turn it off
again.  So tieing it to SELinux (as far as I am concerned) reduces it to
a previously unsolved problem.  :-)

But that's different from opting it on by default for non-SELinux
users.  To which I can only say, "Please, No."

	   	       	    	 	  - Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 19:32             ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-28 19:32 UTC (permalink / raw)
  To: Geo Kozey
  Cc: Linus Torvalds, Djalal Harouni, Kees Cook, James Morris,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet

On Tue, Nov 28, 2017 at 01:16:59PM +0100, Geo Kozey wrote:
> 
> Userspace can be configured in a way which is compatible with those
> changes being on the same as it can be configured to work with
> selinux. That means on distro level or sysadmin level it's a
> valuable tool. It's better than nothing and it's better than using
> some out-of-tree patches instead. Switching one sysctl would make
> their life easier.

If *selinux* can opt-in to something more stringent, such that when
you upgrade to a new version of selinux which enables something which
breaks all modules unless you set up the rules corretly, I don't see a
problem with it.  It might force distributions not to go to the latest
version of SELinux because users get cranky when their systems get
broken, but for people like me, who *still* don't use SELinux because
every few years, i try to enable on my development laptop running
Debian, watch ***far*** too much stuff break. and then turn it off
again.  So tieing it to SELinux (as far as I am concerned) reduces it to
a previously unsolved problem.  :-)

But that's different from opting it on by default for non-SELinux
users.  To which I can only say, "Please, No."

	   	       	    	 	  - Ted

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 19:32             ` Theodore Ts'o
  (?)
@ 2017-11-28 20:08               ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 20:08 UTC (permalink / raw)
  To: Theodore Ts'o, Linus Torvalds, Djalal Harouni
  Cc: Jonathan Corbet, James Morris, LSM List,
	Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 11:32 AM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Tue, Nov 28, 2017 at 01:16:59PM +0100, Geo Kozey wrote:
>>
>> Userspace can be configured in a way which is compatible with those
>> changes being on the same as it can be configured to work with
>> selinux. That means on distro level or sysadmin level it's a
>> valuable tool. It's better than nothing and it's better than using
>> some out-of-tree patches instead. Switching one sysctl would make
>> their life easier.
>
> If *selinux* can opt-in to something more stringent, such that when
> you upgrade to a new version of selinux which enables something which
> breaks all modules unless you set up the rules corretly, I don't see a
> problem with it.  It might force distributions not to go to the latest
> version of SELinux because users get cranky when their systems get
> broken, but for people like me, who *still* don't use SELinux because
> every few years, i try to enable on my development laptop running
> Debian, watch ***far*** too much stuff break. and then turn it off
> again.  So tieing it to SELinux (as far as I am concerned) reduces it to
> a previously unsolved problem.  :-)
>
> But that's different from opting it on by default for non-SELinux
> users.  To which I can only say, "Please, No."

I don't want to see this tied to SELinux because it narrows the
audience, and SELinux still hasn't solved their issues in containers.
I think the per-task setting is sufficient.

Linus, are you okay with this series if the global sysctl gets dropped?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 20:08               ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 20:08 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 11:32 AM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Tue, Nov 28, 2017 at 01:16:59PM +0100, Geo Kozey wrote:
>>
>> Userspace can be configured in a way which is compatible with those
>> changes being on the same as it can be configured to work with
>> selinux. That means on distro level or sysadmin level it's a
>> valuable tool. It's better than nothing and it's better than using
>> some out-of-tree patches instead. Switching one sysctl would make
>> their life easier.
>
> If *selinux* can opt-in to something more stringent, such that when
> you upgrade to a new version of selinux which enables something which
> breaks all modules unless you set up the rules corretly, I don't see a
> problem with it.  It might force distributions not to go to the latest
> version of SELinux because users get cranky when their systems get
> broken, but for people like me, who *still* don't use SELinux because
> every few years, i try to enable on my development laptop running
> Debian, watch ***far*** too much stuff break. and then turn it off
> again.  So tieing it to SELinux (as far as I am concerned) reduces it to
> a previously unsolved problem.  :-)
>
> But that's different from opting it on by default for non-SELinux
> users.  To which I can only say, "Please, No."

I don't want to see this tied to SELinux because it narrows the
audience, and SELinux still hasn't solved their issues in containers.
I think the per-task setting is sufficient.

Linus, are you okay with this series if the global sysctl gets dropped?

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 20:08               ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 20:08 UTC (permalink / raw)
  To: Theodore Ts'o, Linus Torvalds, Djalal Harouni
  Cc: Jonathan Corbet, James Morris, LSM List,
	Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 11:32 AM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Tue, Nov 28, 2017 at 01:16:59PM +0100, Geo Kozey wrote:
>>
>> Userspace can be configured in a way which is compatible with those
>> changes being on the same as it can be configured to work with
>> selinux. That means on distro level or sysadmin level it's a
>> valuable tool. It's better than nothing and it's better than using
>> some out-of-tree patches instead. Switching one sysctl would make
>> their life easier.
>
> If *selinux* can opt-in to something more stringent, such that when
> you upgrade to a new version of selinux which enables something which
> breaks all modules unless you set up the rules corretly, I don't see a
> problem with it.  It might force distributions not to go to the latest
> version of SELinux because users get cranky when their systems get
> broken, but for people like me, who *still* don't use SELinux because
> every few years, i try to enable on my development laptop running
> Debian, watch ***far*** too much stuff break. and then turn it off
> again.  So tieing it to SELinux (as far as I am concerned) reduces it to
> a previously unsolved problem.  :-)
>
> But that's different from opting it on by default for non-SELinux
> users.  To which I can only say, "Please, No."

I don't want to see this tied to SELinux because it narrows the
audience, and SELinux still hasn't solved their issues in containers.
I think the per-task setting is sufficient.

Linus, are you okay with this series if the global sysctl gets dropped?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-28 19:14     ` Luis R. Rodriguez
  (?)
@ 2017-11-28 20:11       ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 20:11 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton, James Morris,
	Ben Hutchings, Solar Designer, Serge Hallyn, Jessica Yu,
	Rusty Russell, LKML, linux-security-module, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 11:14 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> kmod is just a helper to poke userpsace to load a module, that's it.
>
> The old init_module() and newer finit_module() do the real handy work or
> module loading, and both currently only use may_init_module():
>
> static int may_init_module(void)
> {
>         if (!capable(CAP_SYS_MODULE) || modules_disabled)
>                 return -EPERM;
>
>         return 0;
> }
>
> This begs the question:
>
>   o If userspace just tries to just use raw finit_module() do we want similar
>     checks?
>
> Otherwise, correct me if I'm wrong this all seems pointless.

Hm? That's direct-loading, not auto-loading. This series is only about
auto-loading.

We already have a global sysctl for blocking direct-loading (modules_disabled).

> If we want something similar I think we might need to be processing aliases and
> check for the aliases for their desired restrictions on finit_module(),

We don't need to handle aliases.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 20:11       ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 20:11 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 11:14 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> kmod is just a helper to poke userpsace to load a module, that's it.
>
> The old init_module() and newer finit_module() do the real handy work or
> module loading, and both currently only use may_init_module():
>
> static int may_init_module(void)
> {
>         if (!capable(CAP_SYS_MODULE) || modules_disabled)
>                 return -EPERM;
>
>         return 0;
> }
>
> This begs the question:
>
>   o If userspace just tries to just use raw finit_module() do we want similar
>     checks?
>
> Otherwise, correct me if I'm wrong this all seems pointless.

Hm? That's direct-loading, not auto-loading. This series is only about
auto-loading.

We already have a global sysctl for blocking direct-loading (modules_disabled).

> If we want something similar I think we might need to be processing aliases and
> check for the aliases for their desired restrictions on finit_module(),

We don't need to handle aliases.

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 20:11       ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 20:11 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton, James Morris,
	Ben Hutchings, Solar Designer, Serge Hallyn, Jessica Yu,
	Rusty Russell, LKML, linux-security-module, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 11:14 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> kmod is just a helper to poke userpsace to load a module, that's it.
>
> The old init_module() and newer finit_module() do the real handy work or
> module loading, and both currently only use may_init_module():
>
> static int may_init_module(void)
> {
>         if (!capable(CAP_SYS_MODULE) || modules_disabled)
>                 return -EPERM;
>
>         return 0;
> }
>
> This begs the question:
>
>   o If userspace just tries to just use raw finit_module() do we want similar
>     checks?
>
> Otherwise, correct me if I'm wrong this all seems pointless.

Hm? That's direct-loading, not auto-loading. This series is only about
auto-loading.

We already have a global sysctl for blocking direct-loading (modules_disabled).

> If we want something similar I think we might need to be processing aliases and
> check for the aliases for their desired restrictions on finit_module(),

We don't need to handle aliases.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 20:08               ` Kees Cook
  (?)
@ 2017-11-28 20:12                 ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-28 20:12 UTC (permalink / raw)
  To: Kees Cook
  Cc: Theodore Ts'o, Djalal Harouni, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 12:08 PM, Kees Cook <keescook@chromium.org> wrote:
>
> Linus, are you okay with this series if the global sysctl gets dropped?

So really, it's not the "global sysctl" as much as the "global
request_module()" that annoys me.

I'll happily take the request_module_cap() part and the thing that
makes networking use that.

But the flag that we have to default to off because it breaks every
single box otherwise? No. It doesn't  matter if it's one single global
or just a "global behavior for request_module() for this process" at
that point, it's still a pointless security flag that is opt-in.

             Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 20:12                 ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-28 20:12 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 12:08 PM, Kees Cook <keescook@chromium.org> wrote:
>
> Linus, are you okay with this series if the global sysctl gets dropped?

So really, it's not the "global sysctl" as much as the "global
request_module()" that annoys me.

I'll happily take the request_module_cap() part and the thing that
makes networking use that.

But the flag that we have to default to off because it breaks every
single box otherwise? No. It doesn't  matter if it's one single global
or just a "global behavior for request_module() for this process" at
that point, it's still a pointless security flag that is opt-in.

             Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 20:12                 ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-28 20:12 UTC (permalink / raw)
  To: Kees Cook
  Cc: Theodore Ts'o, Djalal Harouni, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 12:08 PM, Kees Cook <keescook@chromium.org> wrote:
>
> Linus, are you okay with this series if the global sysctl gets dropped?

So really, it's not the "global sysctl" as much as the "global
request_module()" that annoys me.

I'll happily take the request_module_cap() part and the thing that
makes networking use that.

But the flag that we have to default to off because it breaks every
single box otherwise? No. It doesn't  matter if it's one single global
or just a "global behavior for request_module() for this process" at
that point, it's still a pointless security flag that is opt-in.

             Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-28 19:14     ` Luis R. Rodriguez
  (?)
@ 2017-11-28 20:18       ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-28 20:18 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, James Morris,
	Ben Hutchings, Solar Designer, Serge Hallyn, Jessica Yu,
	Rusty Russell, linux-kernel, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

Hi Luis,

On Tue, Nov 28, 2017 at 8:14 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Mon, Nov 27, 2017 at 06:18:34PM +0100, Djalal Harouni wrote:
> ...
>
>> After a discussion with Rusty Russell [1], the suggestion was to pass
>> the capability from request_module() to security_kernel_module_request()
>> for 'netdev-%s' modules that need CAP_NET_ADMIN, and after review from
>> Kees Cook [2] and experimenting with the code, the patch now does the
>> following:
>>
>> * Adds the request_module_cap() function.
>> * Updates the __request_module() to take the "required_cap" argument
>>         with the "prefix".
>
> ...
>
>> Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
>> ---
>> diff --git a/kernel/kmod.c b/kernel/kmod.c
>> index bc6addd..679d401 100644
>> --- a/kernel/kmod.c
>> +++ b/kernel/kmod.c
>> @@ -139,13 +147,22 @@ int __request_module(bool wait, const char *fmt, ...)
>>       if (!modprobe_path[0])
>>               return 0;
>>
>> +     /*
>> +      * Lets attach the prefix to the module name
>> +      */
>> +     if (prefix != NULL && *prefix != '\0') {
>> +             len += snprintf(module_name, MODULE_NAME_LEN, "%s-", prefix);
>> +             if (len >= MODULE_NAME_LEN)
>> +                     return -ENAMETOOLONG;
>> +     }
>> +
>>       va_start(args, fmt);
>> -     ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
>> +     ret = vsnprintf(module_name + len, MODULE_NAME_LEN - len, fmt, args);
>>       va_end(args);
>> -     if (ret >= MODULE_NAME_LEN)
>> +     if (ret >= MODULE_NAME_LEN - len)
>>               return -ENAMETOOLONG;
>>
>> -     ret = security_kernel_module_request(module_name);
>> +     ret = security_kernel_module_request(module_name, required_cap, prefix);
>>       if (ret)
>>               return ret;
>>
>
> kmod is just a helper to poke userpsace to load a module, that's it.
>
> The old init_module() and newer finit_module() do the real handy work or
> module loading, and both currently only use may_init_module():
>
> static int may_init_module(void)
> {
>         if (!capable(CAP_SYS_MODULE) || modules_disabled)
>                 return -EPERM;
>
>         return 0;
> }
>
> This begs the question:
>
>   o If userspace just tries to just use raw finit_module() do we want similar
>     checks?
>
> Otherwise, correct me if I'm wrong this all seems pointless.
>
> If we want something similar I think we might need to be processing aliases and
> check for the aliases for their desired restrictions on finit_module(),
> otherwise userspace can skip through the checks if the module name does not
> match the alias prefix.
>
> To be clear, aliases are completely ignored today on load_module(), so loading
> 'xfs' with finit_module() will just have the kernel know about 'xfs' not
> 'fs-xfs'.
>
> So we currently do not process aliases in kernel.
>
> I have debugging patches to enable us to process them, but they are just for
> debugging and I've been meaning to send them in for review. I designed them
> only for debugging given last time someone suggested for aliases processing to
> be added, the only use case we found was a pre-optimizations we decided to avoid
> pursuing. Debugging is a good reason to have alias processing in-kernel though.
>
> The pre-optimization we decided to stay away from was to check if the requested
> module via request_module() was already loaded *and* also check if the name passed
> matches any of the existing module aliases for currently loaded modules. Today
> request_module() does not even check if a requested module is already loaded,
> its a stupid loader, it just goes to userspace, and lets userspace figure it
> out. Userspace in turn could check for aliases, but it could lie, or not be up
> to date to do that.
>
> The pre-optmization is a theoretical gain only then, and if userspace had
> proper alias checking it is arguable that it may perform just as equal.
> To help valuate these sorts of things we now have:
>
> tools/testing/selftests/kmod/kmod.sh
>
> So further patches can use and test impact with it.
>
> Anyway -- so aliasing is currently only a debugging consideration, but without
> processing aliases, all this work seems pointless to me as the real loader is
> in finit_module().

These patchset are about module auto-loading which is triggered from
multiple paths in the kernel, the cover letter notes all the
differences between the two operations and why the explicit one and
"modules_disabled=1" is already a pain.

The finit_module() is covered directly by CAP_SYS_MODULE, and for
aliasing I am not sure how it will be related or how userspace will
maintain it, we do not have a use case for it, we want a simple flag.

Thank you!


-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 20:18       ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-28 20:18 UTC (permalink / raw)
  To: linux-security-module

Hi Luis,

On Tue, Nov 28, 2017 at 8:14 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Mon, Nov 27, 2017 at 06:18:34PM +0100, Djalal Harouni wrote:
> ...
>
>> After a discussion with Rusty Russell [1], the suggestion was to pass
>> the capability from request_module() to security_kernel_module_request()
>> for 'netdev-%s' modules that need CAP_NET_ADMIN, and after review from
>> Kees Cook [2] and experimenting with the code, the patch now does the
>> following:
>>
>> * Adds the request_module_cap() function.
>> * Updates the __request_module() to take the "required_cap" argument
>>         with the "prefix".
>
> ...
>
>> Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
>> ---
>> diff --git a/kernel/kmod.c b/kernel/kmod.c
>> index bc6addd..679d401 100644
>> --- a/kernel/kmod.c
>> +++ b/kernel/kmod.c
>> @@ -139,13 +147,22 @@ int __request_module(bool wait, const char *fmt, ...)
>>       if (!modprobe_path[0])
>>               return 0;
>>
>> +     /*
>> +      * Lets attach the prefix to the module name
>> +      */
>> +     if (prefix != NULL && *prefix != '\0') {
>> +             len += snprintf(module_name, MODULE_NAME_LEN, "%s-", prefix);
>> +             if (len >= MODULE_NAME_LEN)
>> +                     return -ENAMETOOLONG;
>> +     }
>> +
>>       va_start(args, fmt);
>> -     ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
>> +     ret = vsnprintf(module_name + len, MODULE_NAME_LEN - len, fmt, args);
>>       va_end(args);
>> -     if (ret >= MODULE_NAME_LEN)
>> +     if (ret >= MODULE_NAME_LEN - len)
>>               return -ENAMETOOLONG;
>>
>> -     ret = security_kernel_module_request(module_name);
>> +     ret = security_kernel_module_request(module_name, required_cap, prefix);
>>       if (ret)
>>               return ret;
>>
>
> kmod is just a helper to poke userpsace to load a module, that's it.
>
> The old init_module() and newer finit_module() do the real handy work or
> module loading, and both currently only use may_init_module():
>
> static int may_init_module(void)
> {
>         if (!capable(CAP_SYS_MODULE) || modules_disabled)
>                 return -EPERM;
>
>         return 0;
> }
>
> This begs the question:
>
>   o If userspace just tries to just use raw finit_module() do we want similar
>     checks?
>
> Otherwise, correct me if I'm wrong this all seems pointless.
>
> If we want something similar I think we might need to be processing aliases and
> check for the aliases for their desired restrictions on finit_module(),
> otherwise userspace can skip through the checks if the module name does not
> match the alias prefix.
>
> To be clear, aliases are completely ignored today on load_module(), so loading
> 'xfs' with finit_module() will just have the kernel know about 'xfs' not
> 'fs-xfs'.
>
> So we currently do not process aliases in kernel.
>
> I have debugging patches to enable us to process them, but they are just for
> debugging and I've been meaning to send them in for review. I designed them
> only for debugging given last time someone suggested for aliases processing to
> be added, the only use case we found was a pre-optimizations we decided to avoid
> pursuing. Debugging is a good reason to have alias processing in-kernel though.
>
> The pre-optimization we decided to stay away from was to check if the requested
> module via request_module() was already loaded *and* also check if the name passed
> matches any of the existing module aliases for currently loaded modules. Today
> request_module() does not even check if a requested module is already loaded,
> its a stupid loader, it just goes to userspace, and lets userspace figure it
> out. Userspace in turn could check for aliases, but it could lie, or not be up
> to date to do that.
>
> The pre-optmization is a theoretical gain only then, and if userspace had
> proper alias checking it is arguable that it may perform just as equal.
> To help valuate these sorts of things we now have:
>
> tools/testing/selftests/kmod/kmod.sh
>
> So further patches can use and test impact with it.
>
> Anyway -- so aliasing is currently only a debugging consideration, but without
> processing aliases, all this work seems pointless to me as the real loader is
> in finit_module().

These patchset are about module auto-loading which is triggered from
multiple paths in the kernel, the cover letter notes all the
differences between the two operations and why the explicit one and
"modules_disabled=1" is already a pain.

The finit_module() is covered directly by CAP_SYS_MODULE, and for
aliasing I am not sure how it will be related or how userspace will
maintain it, we do not have a use case for it, we want a simple flag.

Thank you!


-- 
tixxdz
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 20:18       ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-28 20:18 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, James Morris,
	Ben Hutchings, Solar Designer, Serge Hallyn, Jessica Yu,
	Rusty Russell, linux-kernel, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

Hi Luis,

On Tue, Nov 28, 2017 at 8:14 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Mon, Nov 27, 2017 at 06:18:34PM +0100, Djalal Harouni wrote:
> ...
>
>> After a discussion with Rusty Russell [1], the suggestion was to pass
>> the capability from request_module() to security_kernel_module_request()
>> for 'netdev-%s' modules that need CAP_NET_ADMIN, and after review from
>> Kees Cook [2] and experimenting with the code, the patch now does the
>> following:
>>
>> * Adds the request_module_cap() function.
>> * Updates the __request_module() to take the "required_cap" argument
>>         with the "prefix".
>
> ...
>
>> Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
>> ---
>> diff --git a/kernel/kmod.c b/kernel/kmod.c
>> index bc6addd..679d401 100644
>> --- a/kernel/kmod.c
>> +++ b/kernel/kmod.c
>> @@ -139,13 +147,22 @@ int __request_module(bool wait, const char *fmt, ...)
>>       if (!modprobe_path[0])
>>               return 0;
>>
>> +     /*
>> +      * Lets attach the prefix to the module name
>> +      */
>> +     if (prefix != NULL && *prefix != '\0') {
>> +             len += snprintf(module_name, MODULE_NAME_LEN, "%s-", prefix);
>> +             if (len >= MODULE_NAME_LEN)
>> +                     return -ENAMETOOLONG;
>> +     }
>> +
>>       va_start(args, fmt);
>> -     ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
>> +     ret = vsnprintf(module_name + len, MODULE_NAME_LEN - len, fmt, args);
>>       va_end(args);
>> -     if (ret >= MODULE_NAME_LEN)
>> +     if (ret >= MODULE_NAME_LEN - len)
>>               return -ENAMETOOLONG;
>>
>> -     ret = security_kernel_module_request(module_name);
>> +     ret = security_kernel_module_request(module_name, required_cap, prefix);
>>       if (ret)
>>               return ret;
>>
>
> kmod is just a helper to poke userpsace to load a module, that's it.
>
> The old init_module() and newer finit_module() do the real handy work or
> module loading, and both currently only use may_init_module():
>
> static int may_init_module(void)
> {
>         if (!capable(CAP_SYS_MODULE) || modules_disabled)
>                 return -EPERM;
>
>         return 0;
> }
>
> This begs the question:
>
>   o If userspace just tries to just use raw finit_module() do we want similar
>     checks?
>
> Otherwise, correct me if I'm wrong this all seems pointless.
>
> If we want something similar I think we might need to be processing aliases and
> check for the aliases for their desired restrictions on finit_module(),
> otherwise userspace can skip through the checks if the module name does not
> match the alias prefix.
>
> To be clear, aliases are completely ignored today on load_module(), so loading
> 'xfs' with finit_module() will just have the kernel know about 'xfs' not
> 'fs-xfs'.
>
> So we currently do not process aliases in kernel.
>
> I have debugging patches to enable us to process them, but they are just for
> debugging and I've been meaning to send them in for review. I designed them
> only for debugging given last time someone suggested for aliases processing to
> be added, the only use case we found was a pre-optimizations we decided to avoid
> pursuing. Debugging is a good reason to have alias processing in-kernel though.
>
> The pre-optimization we decided to stay away from was to check if the requested
> module via request_module() was already loaded *and* also check if the name passed
> matches any of the existing module aliases for currently loaded modules. Today
> request_module() does not even check if a requested module is already loaded,
> its a stupid loader, it just goes to userspace, and lets userspace figure it
> out. Userspace in turn could check for aliases, but it could lie, or not be up
> to date to do that.
>
> The pre-optmization is a theoretical gain only then, and if userspace had
> proper alias checking it is arguable that it may perform just as equal.
> To help valuate these sorts of things we now have:
>
> tools/testing/selftests/kmod/kmod.sh
>
> So further patches can use and test impact with it.
>
> Anyway -- so aliasing is currently only a debugging consideration, but without
> processing aliases, all this work seems pointless to me as the real loader is
> in finit_module().

These patchset are about module auto-loading which is triggered from
multiple paths in the kernel, the cover letter notes all the
differences between the two operations and why the explicit one and
"modules_disabled=1" is already a pain.

The finit_module() is covered directly by CAP_SYS_MODULE, and for
aliasing I am not sure how it will be related or how userspace will
maintain it, we do not have a use case for it, we want a simple flag.

Thank you!


-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 20:12                 ` Linus Torvalds
  (?)
@ 2017-11-28 20:20                   ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 20:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Theodore Ts'o, Djalal Harouni, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 12:12 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Nov 28, 2017 at 12:08 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> Linus, are you okay with this series if the global sysctl gets dropped?
>
> So really, it's not the "global sysctl" as much as the "global
> request_module()" that annoys me.
>
> I'll happily take the request_module_cap() part and the thing that
> makes networking use that.
>
> But the flag that we have to default to off because it breaks every
> single box otherwise? No. It doesn't  matter if it's one single global
> or just a "global behavior for request_module() for this process" at
> that point, it's still a pointless security flag that is opt-in.

To be clear: such a flag wouldn't doesn't break every system, but I
understand your concern.

So what's the right path forward for allowing a way to block
autoloading? Separate existing request_module() calls into "must be
privileged" and "can be unpriv" first, then rework the series to deal
with the "unpriv okay" subset?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 20:20                   ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 20:20 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 12:12 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Nov 28, 2017 at 12:08 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> Linus, are you okay with this series if the global sysctl gets dropped?
>
> So really, it's not the "global sysctl" as much as the "global
> request_module()" that annoys me.
>
> I'll happily take the request_module_cap() part and the thing that
> makes networking use that.
>
> But the flag that we have to default to off because it breaks every
> single box otherwise? No. It doesn't  matter if it's one single global
> or just a "global behavior for request_module() for this process" at
> that point, it's still a pointless security flag that is opt-in.

To be clear: such a flag wouldn't doesn't break every system, but I
understand your concern.

So what's the right path forward for allowing a way to block
autoloading? Separate existing request_module() calls into "must be
privileged" and "can be unpriv" first, then rework the series to deal
with the "unpriv okay" subset?

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 20:20                   ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 20:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Theodore Ts'o, Djalal Harouni, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 12:12 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Nov 28, 2017 at 12:08 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> Linus, are you okay with this series if the global sysctl gets dropped?
>
> So really, it's not the "global sysctl" as much as the "global
> request_module()" that annoys me.
>
> I'll happily take the request_module_cap() part and the thing that
> makes networking use that.
>
> But the flag that we have to default to off because it breaks every
> single box otherwise? No. It doesn't  matter if it's one single global
> or just a "global behavior for request_module() for this process" at
> that point, it's still a pointless security flag that is opt-in.

To be clear: such a flag wouldn't doesn't break every system, but I
understand your concern.

So what's the right path forward for allowing a way to block
autoloading? Separate existing request_module() calls into "must be
privileged" and "can be unpriv" first, then rework the series to deal
with the "unpriv okay" subset?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 20:20                   ` Kees Cook
  (?)
@ 2017-11-28 20:33                     ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-28 20:33 UTC (permalink / raw)
  To: Kees Cook
  Cc: Theodore Ts'o, Djalal Harouni, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 12:20 PM, Kees Cook <keescook@chromium.org> wrote:
>
> So what's the right path forward for allowing a way to block
> autoloading? Separate existing request_module() calls into "must be
> privileged" and "can be unpriv" first, then rework the series to deal
> with the "unpriv okay" subset?

So once we've taken care of the networking ones that check their own
different capability bit, maybe we can then make the regular
request_module() do a rate-limited warning for non-CAP_SYS_MODULE uses
that prints which module it's loading.

And then just see what people report.

Because maybe it's just a very small handful that matters, and we can
say "those are ok".

And maybe that is too optimistic, and we have a lot of device driver
ones because people still have a static /dev and don't have udev
populating modules and device nodes, and then maybe we need to
introduce a "request_module_dev()" where the rule is that you had to
at least have privileges to open the device node.

Because I really am *not* interested in these security flags that are
off by default and then get turned on by special cases. I think it's
completely unacceptable to say "we're insecure by default but then you
can do X and be secure". It doesn't work. It doesn't fix anything.

                 Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 20:33                     ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-28 20:33 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 12:20 PM, Kees Cook <keescook@chromium.org> wrote:
>
> So what's the right path forward for allowing a way to block
> autoloading? Separate existing request_module() calls into "must be
> privileged" and "can be unpriv" first, then rework the series to deal
> with the "unpriv okay" subset?

So once we've taken care of the networking ones that check their own
different capability bit, maybe we can then make the regular
request_module() do a rate-limited warning for non-CAP_SYS_MODULE uses
that prints which module it's loading.

And then just see what people report.

Because maybe it's just a very small handful that matters, and we can
say "those are ok".

And maybe that is too optimistic, and we have a lot of device driver
ones because people still have a static /dev and don't have udev
populating modules and device nodes, and then maybe we need to
introduce a "request_module_dev()" where the rule is that you had to
at least have privileges to open the device node.

Because I really am *not* interested in these security flags that are
off by default and then get turned on by special cases. I think it's
completely unacceptable to say "we're insecure by default but then you
can do X and be secure". It doesn't work. It doesn't fix anything.

                 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 20:33                     ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-28 20:33 UTC (permalink / raw)
  To: Kees Cook
  Cc: Theodore Ts'o, Djalal Harouni, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 12:20 PM, Kees Cook <keescook@chromium.org> wrote:
>
> So what's the right path forward for allowing a way to block
> autoloading? Separate existing request_module() calls into "must be
> privileged" and "can be unpriv" first, then rework the series to deal
> with the "unpriv okay" subset?

So once we've taken care of the networking ones that check their own
different capability bit, maybe we can then make the regular
request_module() do a rate-limited warning for non-CAP_SYS_MODULE uses
that prints which module it's loading.

And then just see what people report.

Because maybe it's just a very small handful that matters, and we can
say "those are ok".

And maybe that is too optimistic, and we have a lot of device driver
ones because people still have a static /dev and don't have udev
populating modules and device nodes, and then maybe we need to
introduce a "request_module_dev()" where the rule is that you had to
at least have privileges to open the device node.

Because I really am *not* interested in these security flags that are
off by default and then get turned on by special cases. I think it's
completely unacceptable to say "we're insecure by default but then you
can do X and be secure". It doesn't work. It doesn't fix anything.

                 Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 20:33                     ` Linus Torvalds
  (?)
@ 2017-11-28 21:10                       ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-28 21:10 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Theodore Ts'o, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 9:33 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Nov 28, 2017 at 12:20 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> So what's the right path forward for allowing a way to block
>> autoloading? Separate existing request_module() calls into "must be
>> privileged" and "can be unpriv" first, then rework the series to deal
>> with the "unpriv okay" subset?
>
> So once we've taken care of the networking ones that check their own
> different capability bit, maybe we can then make the regular
> request_module() do a rate-limited warning for non-CAP_SYS_MODULE uses
> that prints which module it's loading.

Alright, I can start by those.


> And then just see what people report.
>
> Because maybe it's just a very small handful that matters, and we can
> say "those are ok".

The ones that are in the cover letter, etc may not have the
appropriate context, the request_module_dev() sure can be made since
if you can open you already have the context.


> And maybe that is too optimistic, and we have a lot of device driver
> ones because people still have a static /dev and don't have udev
> populating modules and device nodes, and then maybe we need to
> introduce a "request_module_dev()" where the rule is that you had to
> at least have privileges to open the device node.
>
> Because I really am *not* interested in these security flags that are
> off by default and then get turned on by special cases. I think it's
> completely unacceptable to say "we're insecure by default but then you
> can do X and be secure". It doesn't work. It doesn't fix anything.

this still leaves all the cases where we don't have the appropriate
context and other implicit loads that are triggered by another
implicit load, etc.

Also the simple local flag is easy to grasp, with real users for it,
and we can abstract on top with load "newfeatures" of course this does
not mean that we should say "we-re insecure by default". I want it to
be more allow newfeatures or not for apps and users... requiring caps
may give users the idea to pass CAP_SYS_MODULE or other caps for
something that used to work, they may start giving it if we break lot
of usecases, and yeh the caps are much broader and do much more
harm...

Ok, so beside updating with request_module_cap() I will investigate
request_module_dev() and we can see

Thanks!

>                  Linus



-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 21:10                       ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-28 21:10 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 9:33 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Nov 28, 2017 at 12:20 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> So what's the right path forward for allowing a way to block
>> autoloading? Separate existing request_module() calls into "must be
>> privileged" and "can be unpriv" first, then rework the series to deal
>> with the "unpriv okay" subset?
>
> So once we've taken care of the networking ones that check their own
> different capability bit, maybe we can then make the regular
> request_module() do a rate-limited warning for non-CAP_SYS_MODULE uses
> that prints which module it's loading.

Alright, I can start by those.


> And then just see what people report.
>
> Because maybe it's just a very small handful that matters, and we can
> say "those are ok".

The ones that are in the cover letter, etc may not have the
appropriate context, the request_module_dev() sure can be made since
if you can open you already have the context.


> And maybe that is too optimistic, and we have a lot of device driver
> ones because people still have a static /dev and don't have udev
> populating modules and device nodes, and then maybe we need to
> introduce a "request_module_dev()" where the rule is that you had to
> at least have privileges to open the device node.
>
> Because I really am *not* interested in these security flags that are
> off by default and then get turned on by special cases. I think it's
> completely unacceptable to say "we're insecure by default but then you
> can do X and be secure". It doesn't work. It doesn't fix anything.

this still leaves all the cases where we don't have the appropriate
context and other implicit loads that are triggered by another
implicit load, etc.

Also the simple local flag is easy to grasp, with real users for it,
and we can abstract on top with load "newfeatures" of course this does
not mean that we should say "we-re insecure by default". I want it to
be more allow newfeatures or not for apps and users... requiring caps
may give users the idea to pass CAP_SYS_MODULE or other caps for
something that used to work, they may start giving it if we break lot
of usecases, and yeh the caps are much broader and do much more
harm...

Ok, so beside updating with request_module_cap() I will investigate
request_module_dev() and we can see

Thanks!

>                  Linus



-- 
tixxdz
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 21:10                       ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-28 21:10 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Theodore Ts'o, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 9:33 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Nov 28, 2017 at 12:20 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> So what's the right path forward for allowing a way to block
>> autoloading? Separate existing request_module() calls into "must be
>> privileged" and "can be unpriv" first, then rework the series to deal
>> with the "unpriv okay" subset?
>
> So once we've taken care of the networking ones that check their own
> different capability bit, maybe we can then make the regular
> request_module() do a rate-limited warning for non-CAP_SYS_MODULE uses
> that prints which module it's loading.

Alright, I can start by those.


> And then just see what people report.
>
> Because maybe it's just a very small handful that matters, and we can
> say "those are ok".

The ones that are in the cover letter, etc may not have the
appropriate context, the request_module_dev() sure can be made since
if you can open you already have the context.


> And maybe that is too optimistic, and we have a lot of device driver
> ones because people still have a static /dev and don't have udev
> populating modules and device nodes, and then maybe we need to
> introduce a "request_module_dev()" where the rule is that you had to
> at least have privileges to open the device node.
>
> Because I really am *not* interested in these security flags that are
> off by default and then get turned on by special cases. I think it's
> completely unacceptable to say "we're insecure by default but then you
> can do X and be secure". It doesn't work. It doesn't fix anything.

this still leaves all the cases where we don't have the appropriate
context and other implicit loads that are triggered by another
implicit load, etc.

Also the simple local flag is easy to grasp, with real users for it,
and we can abstract on top with load "newfeatures" of course this does
not mean that we should say "we-re insecure by default". I want it to
be more allow newfeatures or not for apps and users... requiring caps
may give users the idea to pass CAP_SYS_MODULE or other caps for
something that used to work, they may start giving it if we break lot
of usecases, and yeh the caps are much broader and do much more
harm...

Ok, so beside updating with request_module_cap() I will investigate
request_module_dev() and we can see

Thanks!

>                  Linus



-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-28 20:11       ` Kees Cook
  (?)
@ 2017-11-28 21:16         ` Luis R. Rodriguez
  -1 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-28 21:16 UTC (permalink / raw)
  To: Kees Cook
  Cc: Luis R. Rodriguez, Djalal Harouni, Andy Lutomirski,
	Andrew Morton, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell, LKML,
	linux-security-module, kernel-hardening, Jonathan Corbet,
	Ingo Molnar, David S. Miller, Network Development,
	Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 12:11:34PM -0800, Kees Cook wrote:
> On Tue, Nov 28, 2017 at 11:14 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > kmod is just a helper to poke userpsace to load a module, that's it.
> >
> > The old init_module() and newer finit_module() do the real handy work or
> > module loading, and both currently only use may_init_module():
> >
> > static int may_init_module(void)
> > {
> >         if (!capable(CAP_SYS_MODULE) || modules_disabled)
> >                 return -EPERM;
> >
> >         return 0;
> > }
> >
> > This begs the question:
> >
> >   o If userspace just tries to just use raw finit_module() do we want similar
> >     checks?
> >
> > Otherwise, correct me if I'm wrong this all seems pointless.
> 
> Hm? That's direct-loading, not auto-loading. This series is only about
> auto-loading.

And *all* auto-loading uses aliases? What's the difference between auto-loading
and direct-loading?

> We already have a global sysctl for blocking direct-loading (modules_disabled).

My point was that even if you have a CAP_NET_ADMIN check on request_module(),
finit_module() will not check for it, so a crafty userspace could still try
to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
check.

So unless I'm missing something, I see no point in adding extra checks for
request_module() but nothing for the respective load_module().

  Luis

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 21:16         ` Luis R. Rodriguez
  0 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-28 21:16 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 12:11:34PM -0800, Kees Cook wrote:
> On Tue, Nov 28, 2017 at 11:14 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > kmod is just a helper to poke userpsace to load a module, that's it.
> >
> > The old init_module() and newer finit_module() do the real handy work or
> > module loading, and both currently only use may_init_module():
> >
> > static int may_init_module(void)
> > {
> >         if (!capable(CAP_SYS_MODULE) || modules_disabled)
> >                 return -EPERM;
> >
> >         return 0;
> > }
> >
> > This begs the question:
> >
> >   o If userspace just tries to just use raw finit_module() do we want similar
> >     checks?
> >
> > Otherwise, correct me if I'm wrong this all seems pointless.
> 
> Hm? That's direct-loading, not auto-loading. This series is only about
> auto-loading.

And *all* auto-loading uses aliases? What's the difference between auto-loading
and direct-loading?

> We already have a global sysctl for blocking direct-loading (modules_disabled).

My point was that even if you have a CAP_NET_ADMIN check on request_module(),
finit_module() will not check for it, so a crafty userspace could still try
to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
check.

So unless I'm missing something, I see no point in adding extra checks for
request_module() but nothing for the respective load_module().

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 21:16         ` Luis R. Rodriguez
  0 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-28 21:16 UTC (permalink / raw)
  To: Kees Cook
  Cc: Luis R. Rodriguez, Djalal Harouni, Andy Lutomirski,
	Andrew Morton, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell, LKML,
	linux-security-module, kernel-hardening, Jonathan Corbet,
	Ingo Molnar, David S. Miller, Network Development,
	Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 12:11:34PM -0800, Kees Cook wrote:
> On Tue, Nov 28, 2017 at 11:14 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > kmod is just a helper to poke userpsace to load a module, that's it.
> >
> > The old init_module() and newer finit_module() do the real handy work or
> > module loading, and both currently only use may_init_module():
> >
> > static int may_init_module(void)
> > {
> >         if (!capable(CAP_SYS_MODULE) || modules_disabled)
> >                 return -EPERM;
> >
> >         return 0;
> > }
> >
> > This begs the question:
> >
> >   o If userspace just tries to just use raw finit_module() do we want similar
> >     checks?
> >
> > Otherwise, correct me if I'm wrong this all seems pointless.
> 
> Hm? That's direct-loading, not auto-loading. This series is only about
> auto-loading.

And *all* auto-loading uses aliases? What's the difference between auto-loading
and direct-loading?

> We already have a global sysctl for blocking direct-loading (modules_disabled).

My point was that even if you have a CAP_NET_ADMIN check on request_module(),
finit_module() will not check for it, so a crafty userspace could still try
to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
check.

So unless I'm missing something, I see no point in adding extra checks for
request_module() but nothing for the respective load_module().

  Luis

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-28 21:16         ` Luis R. Rodriguez
  (?)
@ 2017-11-28 21:33           ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-28 21:33 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, James Morris,
	Ben Hutchings, Solar Designer, Serge Hallyn, Jessica Yu,
	Rusty Russell, LKML, linux-security-module, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 10:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Tue, Nov 28, 2017 at 12:11:34PM -0800, Kees Cook wrote:
>> On Tue, Nov 28, 2017 at 11:14 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> > kmod is just a helper to poke userpsace to load a module, that's it.
>> >
>> > The old init_module() and newer finit_module() do the real handy work or
>> > module loading, and both currently only use may_init_module():
>> >
>> > static int may_init_module(void)
>> > {
>> >         if (!capable(CAP_SYS_MODULE) || modules_disabled)
>> >                 return -EPERM;
>> >
>> >         return 0;
>> > }
>> >
>> > This begs the question:
>> >
>> >   o If userspace just tries to just use raw finit_module() do we want similar
>> >     checks?
>> >
>> > Otherwise, correct me if I'm wrong this all seems pointless.
>>
>> Hm? That's direct-loading, not auto-loading. This series is only about
>> auto-loading.
>
> And *all* auto-loading uses aliases? What's the difference between auto-loading
> and direct-loading?

Not all auto-loading uses aliases, auto-loading is when kernel code
calls request_module() to loads the feature that was not present, and
direct-loading in this thread is the direct syscalls like
finit_module().

>> We already have a global sysctl for blocking direct-loading (modules_disabled).
>
> My point was that even if you have a CAP_NET_ADMIN check on request_module(),
> finit_module() will not check for it, so a crafty userspace could still try
> to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
> check.

The finit_module() uses CAP_SYS_MODULE which should allow all modules
and in this context it should be more privileged than CAP_NET_ADMIN
which is only for "netdev-%s" (to not load arbitrary modules with it).

finit_module() coming from request_module() always has the
CAP_NET_ADMIN, hence the check is done before.

> So unless I'm missing something, I see no point in adding extra checks for
> request_module() but nothing for the respective load_module().

I see, request_module() is called from kernel context which runs in
init namespace will full capabilities, the spawned userspace modprobe
will get CAP_SYS_MODULE and all other caps, then after comes modprobe
and load_module().

Btw as suggested by Linus I will update with request_module_cap() and
I can offer my help maintaining these bits too.


>
>   Luis



-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 21:33           ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-28 21:33 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 10:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Tue, Nov 28, 2017 at 12:11:34PM -0800, Kees Cook wrote:
>> On Tue, Nov 28, 2017 at 11:14 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> > kmod is just a helper to poke userpsace to load a module, that's it.
>> >
>> > The old init_module() and newer finit_module() do the real handy work or
>> > module loading, and both currently only use may_init_module():
>> >
>> > static int may_init_module(void)
>> > {
>> >         if (!capable(CAP_SYS_MODULE) || modules_disabled)
>> >                 return -EPERM;
>> >
>> >         return 0;
>> > }
>> >
>> > This begs the question:
>> >
>> >   o If userspace just tries to just use raw finit_module() do we want similar
>> >     checks?
>> >
>> > Otherwise, correct me if I'm wrong this all seems pointless.
>>
>> Hm? That's direct-loading, not auto-loading. This series is only about
>> auto-loading.
>
> And *all* auto-loading uses aliases? What's the difference between auto-loading
> and direct-loading?

Not all auto-loading uses aliases, auto-loading is when kernel code
calls request_module() to loads the feature that was not present, and
direct-loading in this thread is the direct syscalls like
finit_module().

>> We already have a global sysctl for blocking direct-loading (modules_disabled).
>
> My point was that even if you have a CAP_NET_ADMIN check on request_module(),
> finit_module() will not check for it, so a crafty userspace could still try
> to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
> check.

The finit_module() uses CAP_SYS_MODULE which should allow all modules
and in this context it should be more privileged than CAP_NET_ADMIN
which is only for "netdev-%s" (to not load arbitrary modules with it).

finit_module() coming from request_module() always has the
CAP_NET_ADMIN, hence the check is done before.

> So unless I'm missing something, I see no point in adding extra checks for
> request_module() but nothing for the respective load_module().

I see, request_module() is called from kernel context which runs in
init namespace will full capabilities, the spawned userspace modprobe
will get CAP_SYS_MODULE and all other caps, then after comes modprobe
and load_module().

Btw as suggested by Linus I will update with request_module_cap() and
I can offer my help maintaining these bits too.


>
>   Luis



-- 
tixxdz
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 21:33           ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-28 21:33 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, James Morris,
	Ben Hutchings, Solar Designer, Serge Hallyn, Jessica Yu,
	Rusty Russell, LKML, linux-security-module, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 10:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Tue, Nov 28, 2017 at 12:11:34PM -0800, Kees Cook wrote:
>> On Tue, Nov 28, 2017 at 11:14 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> > kmod is just a helper to poke userpsace to load a module, that's it.
>> >
>> > The old init_module() and newer finit_module() do the real handy work or
>> > module loading, and both currently only use may_init_module():
>> >
>> > static int may_init_module(void)
>> > {
>> >         if (!capable(CAP_SYS_MODULE) || modules_disabled)
>> >                 return -EPERM;
>> >
>> >         return 0;
>> > }
>> >
>> > This begs the question:
>> >
>> >   o If userspace just tries to just use raw finit_module() do we want similar
>> >     checks?
>> >
>> > Otherwise, correct me if I'm wrong this all seems pointless.
>>
>> Hm? That's direct-loading, not auto-loading. This series is only about
>> auto-loading.
>
> And *all* auto-loading uses aliases? What's the difference between auto-loading
> and direct-loading?

Not all auto-loading uses aliases, auto-loading is when kernel code
calls request_module() to loads the feature that was not present, and
direct-loading in this thread is the direct syscalls like
finit_module().

>> We already have a global sysctl for blocking direct-loading (modules_disabled).
>
> My point was that even if you have a CAP_NET_ADMIN check on request_module(),
> finit_module() will not check for it, so a crafty userspace could still try
> to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
> check.

The finit_module() uses CAP_SYS_MODULE which should allow all modules
and in this context it should be more privileged than CAP_NET_ADMIN
which is only for "netdev-%s" (to not load arbitrary modules with it).

finit_module() coming from request_module() always has the
CAP_NET_ADMIN, hence the check is done before.

> So unless I'm missing something, I see no point in adding extra checks for
> request_module() but nothing for the respective load_module().

I see, request_module() is called from kernel context which runs in
init namespace will full capabilities, the spawned userspace modprobe
will get CAP_SYS_MODULE and all other caps, then after comes modprobe
and load_module().

Btw as suggested by Linus I will update with request_module_cap() and
I can offer my help maintaining these bits too.


>
>   Luis



-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 20:33                     ` Linus Torvalds
  (?)
@ 2017-11-28 21:33                       ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 21:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Theodore Ts'o, Djalal Harouni, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 12:33 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Nov 28, 2017 at 12:20 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> So what's the right path forward for allowing a way to block
>> autoloading? Separate existing request_module() calls into "must be
>> privileged" and "can be unpriv" first, then rework the series to deal
>> with the "unpriv okay" subset?
>
> So once we've taken care of the networking ones that check their own
> different capability bit, maybe we can then make the regular
> request_module() do a rate-limited warning for non-CAP_SYS_MODULE uses
> that prints which module it's loading.
>
> And then just see what people report.

I'm fine to try this experiment. I just think it'll be very noisy. :)

> Because maybe it's just a very small handful that matters, and we can
> say "those are ok".

See, that's the problem: it's not a matter of "those are ok". Those
modules will either have bugs discovered in them or will gain bugs
over time. The point is to provide a mechanism to block an entire
attack surface on systems that do not need auto-loading (which is a
lot of systems).

As I've said before, this isn't a theoretical attack surface. This
year alone there have been three known-exploitable flaws exposed by
autoloading:

The exploit for CVE-2017-2636 uses int n_hdlc = N_HDLC; ioctl(fd,
TIOCSETD, &n_hdlc) [1]. This is using the existing "tty-ldisc-"
prefix, and is intentionally unprivileged.

The exploit for CVE-2017-6074 uses socket(PF_INET6, SOCK_DCCP,
IPPROTO_IP) [2]. This is using the existing proto prefix, and is
intentionally unprivileged.

The exploit for CVE-2017-7184 uses socket(PF_NETLINK, SOCK_RAW,
NETLINK_XFRM) [3]. This is using existing proto prefix and became
unprivileged by way of user namespaces (CAP_NET_ADMIN in a
user-namespace).

Concerned admins have already disabled user namespaces, so they were
protected from the last one, but they couldn't stop the first two
without explicit blacklists ahead of time. (Though anyone paying
attention to vulnerability histories would have already blacklisted
DCCP.)

> And maybe that is too optimistic, and we have a lot of device driver
> ones because people still have a static /dev and don't have udev
> populating modules and device nodes, and then maybe we need to
> introduce a "request_module_dev()" where the rule is that you had to
> at least have privileges to open the device node.
>
> Because I really am *not* interested in these security flags that are
> off by default and then get turned on by special cases. I think it's
> completely unacceptable to say "we're insecure by default but then you
> can do X and be secure". It doesn't work. It doesn't fix anything.

I empathize with this stance, but the last many years of hardening has
been doing this just fine, and we've progressively moved defenses from
"off by default" to "on by default", by way of distros, etc, over
time. But sometimes we have things that must stay off by default to
avoid breaking rule #1, even though every distro has turned it on by
default (e.g. link restrictions). So, saying "it must be on by default
and not break userspace" puts us in a bit of a catch-22.

It sounds like the proposed path forward is:

1) convert existing privileged request_module() usage to use explicit
permission and/or capability checks
2) mark the remaining with WARN_RATELIMIT
3) wait
4) ???

It's steps 3 and 4 that worry me because we have a standing need for a
solution here, and "wait" isn't a very convincing solution and step 4
is unknown. We know we're going to have unprivileged module loading
(e.g. AF_ALG), even if we reduce the places where it's used. In the
end, we will still need a way for a process to say "I don't want to
auto-load modules from here on out". That's the part I want to figure
out, otherwise Djalal doesn't have a way to address the attack
surface.

-Kees

[1] https://github.com/snorez/exploits/blob/master/cve-2017-2636/cve-2017-2636.c
[2] https://github.com/xairy/kernel-exploits/blob/master/CVE-2017-6074/poc.c
[3] https://github.com/snorez/exploits/blob/master/cve-2017-7184/exp.c

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 21:33                       ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 21:33 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 12:33 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Nov 28, 2017 at 12:20 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> So what's the right path forward for allowing a way to block
>> autoloading? Separate existing request_module() calls into "must be
>> privileged" and "can be unpriv" first, then rework the series to deal
>> with the "unpriv okay" subset?
>
> So once we've taken care of the networking ones that check their own
> different capability bit, maybe we can then make the regular
> request_module() do a rate-limited warning for non-CAP_SYS_MODULE uses
> that prints which module it's loading.
>
> And then just see what people report.

I'm fine to try this experiment. I just think it'll be very noisy. :)

> Because maybe it's just a very small handful that matters, and we can
> say "those are ok".

See, that's the problem: it's not a matter of "those are ok". Those
modules will either have bugs discovered in them or will gain bugs
over time. The point is to provide a mechanism to block an entire
attack surface on systems that do not need auto-loading (which is a
lot of systems).

As I've said before, this isn't a theoretical attack surface. This
year alone there have been three known-exploitable flaws exposed by
autoloading:

The exploit for CVE-2017-2636 uses int n_hdlc = N_HDLC; ioctl(fd,
TIOCSETD, &n_hdlc) [1]. This is using the existing "tty-ldisc-"
prefix, and is intentionally unprivileged.

The exploit for CVE-2017-6074 uses socket(PF_INET6, SOCK_DCCP,
IPPROTO_IP) [2]. This is using the existing proto prefix, and is
intentionally unprivileged.

The exploit for CVE-2017-7184 uses socket(PF_NETLINK, SOCK_RAW,
NETLINK_XFRM) [3]. This is using existing proto prefix and became
unprivileged by way of user namespaces (CAP_NET_ADMIN in a
user-namespace).

Concerned admins have already disabled user namespaces, so they were
protected from the last one, but they couldn't stop the first two
without explicit blacklists ahead of time. (Though anyone paying
attention to vulnerability histories would have already blacklisted
DCCP.)

> And maybe that is too optimistic, and we have a lot of device driver
> ones because people still have a static /dev and don't have udev
> populating modules and device nodes, and then maybe we need to
> introduce a "request_module_dev()" where the rule is that you had to
> at least have privileges to open the device node.
>
> Because I really am *not* interested in these security flags that are
> off by default and then get turned on by special cases. I think it's
> completely unacceptable to say "we're insecure by default but then you
> can do X and be secure". It doesn't work. It doesn't fix anything.

I empathize with this stance, but the last many years of hardening has
been doing this just fine, and we've progressively moved defenses from
"off by default" to "on by default", by way of distros, etc, over
time. But sometimes we have things that must stay off by default to
avoid breaking rule #1, even though every distro has turned it on by
default (e.g. link restrictions). So, saying "it must be on by default
and not break userspace" puts us in a bit of a catch-22.

It sounds like the proposed path forward is:

1) convert existing privileged request_module() usage to use explicit
permission and/or capability checks
2) mark the remaining with WARN_RATELIMIT
3) wait
4) ???

It's steps 3 and 4 that worry me because we have a standing need for a
solution here, and "wait" isn't a very convincing solution and step 4
is unknown. We know we're going to have unprivileged module loading
(e.g. AF_ALG), even if we reduce the places where it's used. In the
end, we will still need a way for a process to say "I don't want to
auto-load modules from here on out". That's the part I want to figure
out, otherwise Djalal doesn't have a way to address the attack
surface.

-Kees

[1] https://github.com/snorez/exploits/blob/master/cve-2017-2636/cve-2017-2636.c
[2] https://github.com/xairy/kernel-exploits/blob/master/CVE-2017-6074/poc.c
[3] https://github.com/snorez/exploits/blob/master/cve-2017-7184/exp.c

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 21:33                       ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 21:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Theodore Ts'o, Djalal Harouni, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 12:33 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Nov 28, 2017 at 12:20 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>> So what's the right path forward for allowing a way to block
>> autoloading? Separate existing request_module() calls into "must be
>> privileged" and "can be unpriv" first, then rework the series to deal
>> with the "unpriv okay" subset?
>
> So once we've taken care of the networking ones that check their own
> different capability bit, maybe we can then make the regular
> request_module() do a rate-limited warning for non-CAP_SYS_MODULE uses
> that prints which module it's loading.
>
> And then just see what people report.

I'm fine to try this experiment. I just think it'll be very noisy. :)

> Because maybe it's just a very small handful that matters, and we can
> say "those are ok".

See, that's the problem: it's not a matter of "those are ok". Those
modules will either have bugs discovered in them or will gain bugs
over time. The point is to provide a mechanism to block an entire
attack surface on systems that do not need auto-loading (which is a
lot of systems).

As I've said before, this isn't a theoretical attack surface. This
year alone there have been three known-exploitable flaws exposed by
autoloading:

The exploit for CVE-2017-2636 uses int n_hdlc = N_HDLC; ioctl(fd,
TIOCSETD, &n_hdlc) [1]. This is using the existing "tty-ldisc-"
prefix, and is intentionally unprivileged.

The exploit for CVE-2017-6074 uses socket(PF_INET6, SOCK_DCCP,
IPPROTO_IP) [2]. This is using the existing proto prefix, and is
intentionally unprivileged.

The exploit for CVE-2017-7184 uses socket(PF_NETLINK, SOCK_RAW,
NETLINK_XFRM) [3]. This is using existing proto prefix and became
unprivileged by way of user namespaces (CAP_NET_ADMIN in a
user-namespace).

Concerned admins have already disabled user namespaces, so they were
protected from the last one, but they couldn't stop the first two
without explicit blacklists ahead of time. (Though anyone paying
attention to vulnerability histories would have already blacklisted
DCCP.)

> And maybe that is too optimistic, and we have a lot of device driver
> ones because people still have a static /dev and don't have udev
> populating modules and device nodes, and then maybe we need to
> introduce a "request_module_dev()" where the rule is that you had to
> at least have privileges to open the device node.
>
> Because I really am *not* interested in these security flags that are
> off by default and then get turned on by special cases. I think it's
> completely unacceptable to say "we're insecure by default but then you
> can do X and be secure". It doesn't work. It doesn't fix anything.

I empathize with this stance, but the last many years of hardening has
been doing this just fine, and we've progressively moved defenses from
"off by default" to "on by default", by way of distros, etc, over
time. But sometimes we have things that must stay off by default to
avoid breaking rule #1, even though every distro has turned it on by
default (e.g. link restrictions). So, saying "it must be on by default
and not break userspace" puts us in a bit of a catch-22.

It sounds like the proposed path forward is:

1) convert existing privileged request_module() usage to use explicit
permission and/or capability checks
2) mark the remaining with WARN_RATELIMIT
3) wait
4) ???

It's steps 3 and 4 that worry me because we have a standing need for a
solution here, and "wait" isn't a very convincing solution and step 4
is unknown. We know we're going to have unprivileged module loading
(e.g. AF_ALG), even if we reduce the places where it's used. In the
end, we will still need a way for a process to say "I don't want to
auto-load modules from here on out". That's the part I want to figure
out, otherwise Djalal doesn't have a way to address the attack
surface.

-Kees

[1] https://github.com/snorez/exploits/blob/master/cve-2017-2636/cve-2017-2636.c
[2] https://github.com/xairy/kernel-exploits/blob/master/CVE-2017-6074/poc.c
[3] https://github.com/snorez/exploits/blob/master/cve-2017-7184/exp.c

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-28 21:16         ` Luis R. Rodriguez
  (?)
@ 2017-11-28 21:39           ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 21:39 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton, James Morris,
	Ben Hutchings, Solar Designer, Serge Hallyn, Jessica Yu,
	Rusty Russell, LKML, linux-security-module, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> And *all* auto-loading uses aliases? What's the difference between auto-loading
> and direct-loading?

The difference is the process privileges. Unprivilged autoloading
(e.g. int n_hdlc = N_HDLC; ioctl(fd,
TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
under CAP_SYS_MODULE.

>> We already have a global sysctl for blocking direct-loading (modules_disabled).
>
> My point was that even if you have a CAP_NET_ADMIN check on request_module(),
> finit_module() will not check for it, so a crafty userspace could still try
> to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
> check.

You need CAP_SYS_MODULE to run finit_module().

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 21:39           ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 21:39 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> And *all* auto-loading uses aliases? What's the difference between auto-loading
> and direct-loading?

The difference is the process privileges. Unprivilged autoloading
(e.g. int n_hdlc = N_HDLC; ioctl(fd,
TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
under CAP_SYS_MODULE.

>> We already have a global sysctl for blocking direct-loading (modules_disabled).
>
> My point was that even if you have a CAP_NET_ADMIN check on request_module(),
> finit_module() will not check for it, so a crafty userspace could still try
> to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
> check.

You need CAP_SYS_MODULE to run finit_module().

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 21:39           ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 21:39 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton, James Morris,
	Ben Hutchings, Solar Designer, Serge Hallyn, Jessica Yu,
	Rusty Russell, LKML, linux-security-module, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> And *all* auto-loading uses aliases? What's the difference between auto-loading
> and direct-loading?

The difference is the process privileges. Unprivilged autoloading
(e.g. int n_hdlc = N_HDLC; ioctl(fd,
TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
under CAP_SYS_MODULE.

>> We already have a global sysctl for blocking direct-loading (modules_disabled).
>
> My point was that even if you have a CAP_NET_ADMIN check on request_module(),
> finit_module() will not check for it, so a crafty userspace could still try
> to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
> check.

You need CAP_SYS_MODULE to run finit_module().

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 20:33                     ` Linus Torvalds
  (?)
@ 2017-11-28 21:51                       ` Geo Kozey
  -1 siblings, 0 replies; 266+ messages in thread
From: Geo Kozey @ 2017-11-28 21:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: LSM List, Linux Kernel Mailing List, kernel-hardening

> From: Linus Torvalds <torvalds@linux-foundation.org>
> Sent: Tue Nov 28 21:33:22 CET 2017
> To: Kees Cook <keescook@chromium.org>
> Subject: Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules

> Because I really am *not* interested in these security flags that are
> off by default and then get turned on by special cases. I think it's
> completely unacceptable to say "we're insecure by default but then you
> can do X and be secure". It doesn't work. It doesn't fix anything.
> 
>                  Linus
> ----------------------------------------

What about "we're insecure by default but you can't do anything to change this"? It describes current situation. For last 20 years linux allowed for insecure behavior and tons of tools were built depending on it. It's recurring theme of kernel security development. I'll be glad if some genius propose perfect idea solving this problem but I'm afraid things go nowhere instead.

Yours sincerely

G. K.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 21:51                       ` Geo Kozey
  0 siblings, 0 replies; 266+ messages in thread
From: Geo Kozey @ 2017-11-28 21:51 UTC (permalink / raw)
  To: linux-security-module

> From: Linus Torvalds <torvalds@linux-foundation.org>
> Sent: Tue Nov 28 21:33:22 CET 2017
> To: Kees Cook <keescook@chromium.org>
> Subject: Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules

> Because I really am *not* interested in these security flags that are
> off by default and then get turned on by special cases. I think it's
> completely unacceptable to say "we're insecure by default but then you
> can do X and be secure". It doesn't work. It doesn't fix anything.
> 
>                  Linus
> ----------------------------------------

What about "we're insecure by default but you can't do anything to change this"? It describes current situation. For last 20 years linux allowed for insecure behavior and tons of tools were built depending on it. It's recurring theme of kernel security development. I'll be glad if some genius propose perfect idea solving this problem but I'm afraid things go nowhere instead.

Yours sincerely

G. K.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 21:51                       ` Geo Kozey
  0 siblings, 0 replies; 266+ messages in thread
From: Geo Kozey @ 2017-11-28 21:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: LSM List, Linux Kernel Mailing List, kernel-hardening

> From: Linus Torvalds <torvalds@linux-foundation.org>
> Sent: Tue Nov 28 21:33:22 CET 2017
> To: Kees Cook <keescook@chromium.org>
> Subject: Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules

> Because I really am *not* interested in these security flags that are
> off by default and then get turned on by special cases. I think it's
> completely unacceptable to say "we're insecure by default but then you
> can do X and be secure". It doesn't work. It doesn't fix anything.
> 
>                  Linus
> ----------------------------------------

What about "we're insecure by default but you can't do anything to change this"? It describes current situation. For last 20 years linux allowed for insecure behavior and tons of tools were built depending on it. It's recurring theme of kernel security development. I'll be glad if some genius propose perfect idea solving this problem but I'm afraid things go nowhere instead.

Yours sincerely

G. K.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-28 21:39           ` Kees Cook
  (?)
@ 2017-11-28 22:12             ` Luis R. Rodriguez
  -1 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-28 22:12 UTC (permalink / raw)
  To: Kees Cook
  Cc: Luis R. Rodriguez, Djalal Harouni, Andy Lutomirski,
	Andrew Morton, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell, LKML,
	linux-security-module, kernel-hardening, Jonathan Corbet,
	Ingo Molnar, David S. Miller, Network Development,
	Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 01:39:58PM -0800, Kees Cook wrote:
> On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > And *all* auto-loading uses aliases? What's the difference between auto-loading
> > and direct-loading?
> 
> The difference is the process privileges. Unprivilged autoloading
> (e.g. int n_hdlc = N_HDLC; ioctl(fd,
> TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
> under CAP_SYS_MODULE.

Ah, so system call implicated request_module() calls.

> >> We already have a global sysctl for blocking direct-loading (modules_disabled).
> >
> > My point was that even if you have a CAP_NET_ADMIN check on request_module(),
> > finit_module() will not check for it, so a crafty userspace could still try
> > to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
> > check.
> 
> You need CAP_SYS_MODULE to run finit_module().

OK and since CAP_SYS_MODULE is much more restrictive one could argue, what's the
point here?

  Luis

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 22:12             ` Luis R. Rodriguez
  0 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-28 22:12 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 01:39:58PM -0800, Kees Cook wrote:
> On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > And *all* auto-loading uses aliases? What's the difference between auto-loading
> > and direct-loading?
> 
> The difference is the process privileges. Unprivilged autoloading
> (e.g. int n_hdlc = N_HDLC; ioctl(fd,
> TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
> under CAP_SYS_MODULE.

Ah, so system call implicated request_module() calls.

> >> We already have a global sysctl for blocking direct-loading (modules_disabled).
> >
> > My point was that even if you have a CAP_NET_ADMIN check on request_module(),
> > finit_module() will not check for it, so a crafty userspace could still try
> > to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
> > check.
> 
> You need CAP_SYS_MODULE to run finit_module().

OK and since CAP_SYS_MODULE is much more restrictive one could argue, what's the
point here?

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 22:12             ` Luis R. Rodriguez
  0 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-28 22:12 UTC (permalink / raw)
  To: Kees Cook
  Cc: Luis R. Rodriguez, Djalal Harouni, Andy Lutomirski,
	Andrew Morton, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell, LKML,
	linux-security-module, kernel-hardening, Jonathan Corbet,
	Ingo Molnar, David S. Miller, Network Development,
	Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 01:39:58PM -0800, Kees Cook wrote:
> On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > And *all* auto-loading uses aliases? What's the difference between auto-loading
> > and direct-loading?
> 
> The difference is the process privileges. Unprivilged autoloading
> (e.g. int n_hdlc = N_HDLC; ioctl(fd,
> TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
> under CAP_SYS_MODULE.

Ah, so system call implicated request_module() calls.

> >> We already have a global sysctl for blocking direct-loading (modules_disabled).
> >
> > My point was that even if you have a CAP_NET_ADMIN check on request_module(),
> > finit_module() will not check for it, so a crafty userspace could still try
> > to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
> > check.
> 
> You need CAP_SYS_MODULE to run finit_module().

OK and since CAP_SYS_MODULE is much more restrictive one could argue, what's the
point here?

  Luis

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-28 22:12             ` Luis R. Rodriguez
  (?)
@ 2017-11-28 22:18               ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 22:18 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton, James Morris,
	Ben Hutchings, Solar Designer, Serge Hallyn, Jessica Yu,
	Rusty Russell, LKML, linux-security-module, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 2:12 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Tue, Nov 28, 2017 at 01:39:58PM -0800, Kees Cook wrote:
>> On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> > And *all* auto-loading uses aliases? What's the difference between auto-loading
>> > and direct-loading?
>>
>> The difference is the process privileges. Unprivilged autoloading
>> (e.g. int n_hdlc = N_HDLC; ioctl(fd,
>> TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
>> under CAP_SYS_MODULE.
>
> Ah, so system call implicated request_module() calls.

Yup. Unprivileged user does something that ultimately hits a
request_module() in the kernel. Then the kernel calls out with the
usermode helper (which has CAP_SYS_MODULE) and calls finit_module().

> OK and since CAP_SYS_MODULE is much more restrictive one could argue, what's the
> point here?

The goal is to block an unprivileged user from being able to trigger a
module load without blocking root from loading modules directly.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 22:18               ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 22:18 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 2:12 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Tue, Nov 28, 2017 at 01:39:58PM -0800, Kees Cook wrote:
>> On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> > And *all* auto-loading uses aliases? What's the difference between auto-loading
>> > and direct-loading?
>>
>> The difference is the process privileges. Unprivilged autoloading
>> (e.g. int n_hdlc = N_HDLC; ioctl(fd,
>> TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
>> under CAP_SYS_MODULE.
>
> Ah, so system call implicated request_module() calls.

Yup. Unprivileged user does something that ultimately hits a
request_module() in the kernel. Then the kernel calls out with the
usermode helper (which has CAP_SYS_MODULE) and calls finit_module().

> OK and since CAP_SYS_MODULE is much more restrictive one could argue, what's the
> point here?

The goal is to block an unprivileged user from being able to trigger a
module load without blocking root from loading modules directly.

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 22:18               ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 22:18 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Djalal Harouni, Andy Lutomirski, Andrew Morton, James Morris,
	Ben Hutchings, Solar Designer, Serge Hallyn, Jessica Yu,
	Rusty Russell, LKML, linux-security-module, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 2:12 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Tue, Nov 28, 2017 at 01:39:58PM -0800, Kees Cook wrote:
>> On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> > And *all* auto-loading uses aliases? What's the difference between auto-loading
>> > and direct-loading?
>>
>> The difference is the process privileges. Unprivilged autoloading
>> (e.g. int n_hdlc = N_HDLC; ioctl(fd,
>> TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
>> under CAP_SYS_MODULE.
>
> Ah, so system call implicated request_module() calls.

Yup. Unprivileged user does something that ultimately hits a
request_module() in the kernel. Then the kernel calls out with the
usermode helper (which has CAP_SYS_MODULE) and calls finit_module().

> OK and since CAP_SYS_MODULE is much more restrictive one could argue, what's the
> point here?

The goal is to block an unprivileged user from being able to trigger a
module load without blocking root from loading modules directly.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-28 21:33           ` Djalal Harouni
  (?)
@ 2017-11-28 22:18             ` Luis R. Rodriguez
  -1 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-28 22:18 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Luis R. Rodriguez, Kees Cook, Andy Lutomirski, Andrew Morton,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, LKML, linux-security-module,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 10:33:27PM +0100, Djalal Harouni wrote:
> On Tue, Nov 28, 2017 at 10:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > On Tue, Nov 28, 2017 at 12:11:34PM -0800, Kees Cook wrote:
> >> On Tue, Nov 28, 2017 at 11:14 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> >> > kmod is just a helper to poke userpsace to load a module, that's it.
> >> >
> >> > The old init_module() and newer finit_module() do the real handy work or
> >> > module loading, and both currently only use may_init_module():
> >> >
> >> > static int may_init_module(void)
> >> > {
> >> >         if (!capable(CAP_SYS_MODULE) || modules_disabled)
> >> >                 return -EPERM;
> >> >
> >> >         return 0;
> >> > }
> >> >
> >> > This begs the question:
> >> >
> >> >   o If userspace just tries to just use raw finit_module() do we want similar
> >> >     checks?
> >> >
> >> > Otherwise, correct me if I'm wrong this all seems pointless.
> >>
> >> Hm? That's direct-loading, not auto-loading. This series is only about
> >> auto-loading.
> >
> > And *all* auto-loading uses aliases? What's the difference between auto-loading
> > and direct-loading?
> 
> Not all auto-loading uses aliases, auto-loading is when kernel code
> calls request_module() to loads the feature that was not present, 

It seems the actual interest here is system call implicated request_module()
calls? Because there are uses of request_module() which may be module hacks,
and not implicated via system calls.

> and direct-loading in this thread is the direct syscalls like
> finit_module().

OK.

> >> We already have a global sysctl for blocking direct-loading (modules_disabled).
> >
> > My point was that even if you have a CAP_NET_ADMIN check on request_module(),
> > finit_module() will not check for it, so a crafty userspace could still try
> > to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
> > check.
> 
> The finit_module() uses CAP_SYS_MODULE which should allow all modules
> and in this context it should be more privileged than CAP_NET_ADMIN
> which is only for "netdev-%s" (to not load arbitrary modules with it).
> 
> finit_module() coming from request_module() always has the
> CAP_NET_ADMIN, hence the check is done before.

But since CAP_SYS_MODULE is more restrictive, what's the point in checking
for CAP_NET_ADMIN?

> > So unless I'm missing something, I see no point in adding extra checks for
> > request_module() but nothing for the respective load_module().
> 
> I see, request_module() is called from kernel context which runs in
> init namespace will full capabilities, the spawned userspace modprobe
> will get CAP_SYS_MODULE and all other caps, then after comes modprobe
> and load_module().

Right, so defining the gains of adding this extra check is not very clear
yet. It would seem a benefit exists, what is it?

> Btw as suggested by Linus I will update with request_module_cap() and > I can
> offer my help maintaining these bits too.

Can you start by extending lib/test_module.c and
tools/testing/selftests/kmod/kmod.sh with a proof of concept of the gains here,
as well as ensuring things work as expected ?

  Luis

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 22:18             ` Luis R. Rodriguez
  0 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-28 22:18 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 10:33:27PM +0100, Djalal Harouni wrote:
> On Tue, Nov 28, 2017 at 10:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > On Tue, Nov 28, 2017 at 12:11:34PM -0800, Kees Cook wrote:
> >> On Tue, Nov 28, 2017 at 11:14 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> >> > kmod is just a helper to poke userpsace to load a module, that's it.
> >> >
> >> > The old init_module() and newer finit_module() do the real handy work or
> >> > module loading, and both currently only use may_init_module():
> >> >
> >> > static int may_init_module(void)
> >> > {
> >> >         if (!capable(CAP_SYS_MODULE) || modules_disabled)
> >> >                 return -EPERM;
> >> >
> >> >         return 0;
> >> > }
> >> >
> >> > This begs the question:
> >> >
> >> >   o If userspace just tries to just use raw finit_module() do we want similar
> >> >     checks?
> >> >
> >> > Otherwise, correct me if I'm wrong this all seems pointless.
> >>
> >> Hm? That's direct-loading, not auto-loading. This series is only about
> >> auto-loading.
> >
> > And *all* auto-loading uses aliases? What's the difference between auto-loading
> > and direct-loading?
> 
> Not all auto-loading uses aliases, auto-loading is when kernel code
> calls request_module() to loads the feature that was not present, 

It seems the actual interest here is system call implicated request_module()
calls? Because there are uses of request_module() which may be module hacks,
and not implicated via system calls.

> and direct-loading in this thread is the direct syscalls like
> finit_module().

OK.

> >> We already have a global sysctl for blocking direct-loading (modules_disabled).
> >
> > My point was that even if you have a CAP_NET_ADMIN check on request_module(),
> > finit_module() will not check for it, so a crafty userspace could still try
> > to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
> > check.
> 
> The finit_module() uses CAP_SYS_MODULE which should allow all modules
> and in this context it should be more privileged than CAP_NET_ADMIN
> which is only for "netdev-%s" (to not load arbitrary modules with it).
> 
> finit_module() coming from request_module() always has the
> CAP_NET_ADMIN, hence the check is done before.

But since CAP_SYS_MODULE is more restrictive, what's the point in checking
for CAP_NET_ADMIN?

> > So unless I'm missing something, I see no point in adding extra checks for
> > request_module() but nothing for the respective load_module().
> 
> I see, request_module() is called from kernel context which runs in
> init namespace will full capabilities, the spawned userspace modprobe
> will get CAP_SYS_MODULE and all other caps, then after comes modprobe
> and load_module().

Right, so defining the gains of adding this extra check is not very clear
yet. It would seem a benefit exists, what is it?

> Btw as suggested by Linus I will update with request_module_cap() and > I can
> offer my help maintaining these bits too.

Can you start by extending lib/test_module.c and
tools/testing/selftests/kmod/kmod.sh with a proof of concept of the gains here,
as well as ensuring things work as expected ?

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 22:18             ` Luis R. Rodriguez
  0 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-28 22:18 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Luis R. Rodriguez, Kees Cook, Andy Lutomirski, Andrew Morton,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, LKML, linux-security-module,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 10:33:27PM +0100, Djalal Harouni wrote:
> On Tue, Nov 28, 2017 at 10:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > On Tue, Nov 28, 2017 at 12:11:34PM -0800, Kees Cook wrote:
> >> On Tue, Nov 28, 2017 at 11:14 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> >> > kmod is just a helper to poke userpsace to load a module, that's it.
> >> >
> >> > The old init_module() and newer finit_module() do the real handy work or
> >> > module loading, and both currently only use may_init_module():
> >> >
> >> > static int may_init_module(void)
> >> > {
> >> >         if (!capable(CAP_SYS_MODULE) || modules_disabled)
> >> >                 return -EPERM;
> >> >
> >> >         return 0;
> >> > }
> >> >
> >> > This begs the question:
> >> >
> >> >   o If userspace just tries to just use raw finit_module() do we want similar
> >> >     checks?
> >> >
> >> > Otherwise, correct me if I'm wrong this all seems pointless.
> >>
> >> Hm? That's direct-loading, not auto-loading. This series is only about
> >> auto-loading.
> >
> > And *all* auto-loading uses aliases? What's the difference between auto-loading
> > and direct-loading?
> 
> Not all auto-loading uses aliases, auto-loading is when kernel code
> calls request_module() to loads the feature that was not present, 

It seems the actual interest here is system call implicated request_module()
calls? Because there are uses of request_module() which may be module hacks,
and not implicated via system calls.

> and direct-loading in this thread is the direct syscalls like
> finit_module().

OK.

> >> We already have a global sysctl for blocking direct-loading (modules_disabled).
> >
> > My point was that even if you have a CAP_NET_ADMIN check on request_module(),
> > finit_module() will not check for it, so a crafty userspace could still try
> > to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
> > check.
> 
> The finit_module() uses CAP_SYS_MODULE which should allow all modules
> and in this context it should be more privileged than CAP_NET_ADMIN
> which is only for "netdev-%s" (to not load arbitrary modules with it).
> 
> finit_module() coming from request_module() always has the
> CAP_NET_ADMIN, hence the check is done before.

But since CAP_SYS_MODULE is more restrictive, what's the point in checking
for CAP_NET_ADMIN?

> > So unless I'm missing something, I see no point in adding extra checks for
> > request_module() but nothing for the respective load_module().
> 
> I see, request_module() is called from kernel context which runs in
> init namespace will full capabilities, the spawned userspace modprobe
> will get CAP_SYS_MODULE and all other caps, then after comes modprobe
> and load_module().

Right, so defining the gains of adding this extra check is not very clear
yet. It would seem a benefit exists, what is it?

> Btw as suggested by Linus I will update with request_module_cap() and > I can
> offer my help maintaining these bits too.

Can you start by extending lib/test_module.c and
tools/testing/selftests/kmod/kmod.sh with a proof of concept of the gains here,
as well as ensuring things work as expected ?

  Luis

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-28 22:18               ` Kees Cook
  (?)
@ 2017-11-28 22:48                 ` Luis R. Rodriguez
  -1 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-28 22:48 UTC (permalink / raw)
  To: Kees Cook
  Cc: Luis R. Rodriguez, Djalal Harouni, Andy Lutomirski,
	Andrew Morton, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell, LKML,
	linux-security-module, kernel-hardening, Jonathan Corbet,
	Ingo Molnar, David S. Miller, Network Development,
	Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 02:18:18PM -0800, Kees Cook wrote:
> On Tue, Nov 28, 2017 at 2:12 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > On Tue, Nov 28, 2017 at 01:39:58PM -0800, Kees Cook wrote:
> >> On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> >> > And *all* auto-loading uses aliases? What's the difference between auto-loading
> >> > and direct-loading?
> >>
> >> The difference is the process privileges. Unprivilged autoloading
> >> (e.g. int n_hdlc = N_HDLC; ioctl(fd,
> >> TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
> >> under CAP_SYS_MODULE.
> >
> > Ah, so system call implicated request_module() calls.
> 
> Yup. Unprivileged user does something that ultimately hits a
> request_module() in the kernel. Then the kernel calls out with the
> usermode helper (which has CAP_SYS_MODULE) and calls finit_module().

Thanks, using this terminology is much better to understand than auto-loading,
given it does make it clear an unprivileged call was one that initiated the
request_module() call, there are many uses of request_module() which *are*
privileged.

> > OK and since CAP_SYS_MODULE is much more restrictive one could argue, what's the
> > point here?
> 
> The goal is to block an unprivileged user from being able to trigger a
> module load without blocking root from loading modules directly.

I see now. Do we have an audit of all system calls which implicate a
request_module() call? Networking is a good example for sure to start
off with but I was curious if we have a grasp of how wide spread this
could be.

I'll go review the patches again now with all this in mind.

  Luis

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 22:48                 ` Luis R. Rodriguez
  0 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-28 22:48 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 02:18:18PM -0800, Kees Cook wrote:
> On Tue, Nov 28, 2017 at 2:12 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > On Tue, Nov 28, 2017 at 01:39:58PM -0800, Kees Cook wrote:
> >> On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> >> > And *all* auto-loading uses aliases? What's the difference between auto-loading
> >> > and direct-loading?
> >>
> >> The difference is the process privileges. Unprivilged autoloading
> >> (e.g. int n_hdlc = N_HDLC; ioctl(fd,
> >> TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
> >> under CAP_SYS_MODULE.
> >
> > Ah, so system call implicated request_module() calls.
> 
> Yup. Unprivileged user does something that ultimately hits a
> request_module() in the kernel. Then the kernel calls out with the
> usermode helper (which has CAP_SYS_MODULE) and calls finit_module().

Thanks, using this terminology is much better to understand than auto-loading,
given it does make it clear an unprivileged call was one that initiated the
request_module() call, there are many uses of request_module() which *are*
privileged.

> > OK and since CAP_SYS_MODULE is much more restrictive one could argue, what's the
> > point here?
> 
> The goal is to block an unprivileged user from being able to trigger a
> module load without blocking root from loading modules directly.

I see now. Do we have an audit of all system calls which implicate a
request_module() call? Networking is a good example for sure to start
off with but I was curious if we have a grasp of how wide spread this
could be.

I'll go review the patches again now with all this in mind.

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 22:48                 ` Luis R. Rodriguez
  0 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-28 22:48 UTC (permalink / raw)
  To: Kees Cook
  Cc: Luis R. Rodriguez, Djalal Harouni, Andy Lutomirski,
	Andrew Morton, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell, LKML,
	linux-security-module, kernel-hardening, Jonathan Corbet,
	Ingo Molnar, David S. Miller, Network Development,
	Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 02:18:18PM -0800, Kees Cook wrote:
> On Tue, Nov 28, 2017 at 2:12 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > On Tue, Nov 28, 2017 at 01:39:58PM -0800, Kees Cook wrote:
> >> On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> >> > And *all* auto-loading uses aliases? What's the difference between auto-loading
> >> > and direct-loading?
> >>
> >> The difference is the process privileges. Unprivilged autoloading
> >> (e.g. int n_hdlc = N_HDLC; ioctl(fd,
> >> TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
> >> under CAP_SYS_MODULE.
> >
> > Ah, so system call implicated request_module() calls.
> 
> Yup. Unprivileged user does something that ultimately hits a
> request_module() in the kernel. Then the kernel calls out with the
> usermode helper (which has CAP_SYS_MODULE) and calls finit_module().

Thanks, using this terminology is much better to understand than auto-loading,
given it does make it clear an unprivileged call was one that initiated the
request_module() call, there are many uses of request_module() which *are*
privileged.

> > OK and since CAP_SYS_MODULE is much more restrictive one could argue, what's the
> > point here?
> 
> The goal is to block an unprivileged user from being able to trigger a
> module load without blocking root from loading modules directly.

I see now. Do we have an audit of all system calls which implicate a
request_module() call? Networking is a good example for sure to start
off with but I was curious if we have a grasp of how wide spread this
could be.

I'll go review the patches again now with all this in mind.

  Luis

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-28 22:18             ` Luis R. Rodriguez
  (?)
@ 2017-11-28 22:52               ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-28 22:52 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, James Morris,
	Ben Hutchings, Solar Designer, Serge Hallyn, Jessica Yu,
	Rusty Russell, LKML, linux-security-module, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 11:18 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Tue, Nov 28, 2017 at 10:33:27PM +0100, Djalal Harouni wrote:
>> On Tue, Nov 28, 2017 at 10:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> > On Tue, Nov 28, 2017 at 12:11:34PM -0800, Kees Cook wrote:
>> >> On Tue, Nov 28, 2017 at 11:14 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> >> > kmod is just a helper to poke userpsace to load a module, that's it.
>> >> >
>> >> > The old init_module() and newer finit_module() do the real handy work or
>> >> > module loading, and both currently only use may_init_module():
>> >> >
>> >> > static int may_init_module(void)
>> >> > {
>> >> >         if (!capable(CAP_SYS_MODULE) || modules_disabled)
>> >> >                 return -EPERM;
>> >> >
>> >> >         return 0;
>> >> > }
>> >> >
>> >> > This begs the question:
>> >> >
>> >> >   o If userspace just tries to just use raw finit_module() do we want similar
>> >> >     checks?
>> >> >
>> >> > Otherwise, correct me if I'm wrong this all seems pointless.
>> >>
>> >> Hm? That's direct-loading, not auto-loading. This series is only about
>> >> auto-loading.
>> >
>> > And *all* auto-loading uses aliases? What's the difference between auto-loading
>> > and direct-loading?
>>
>> Not all auto-loading uses aliases, auto-loading is when kernel code
>> calls request_module() to loads the feature that was not present,
>
> It seems the actual interest here is system call implicated request_module()
> calls? Because there are uses of request_module() which may be module hacks,
> and not implicated via system calls.

Indeed.


>> and direct-loading in this thread is the direct syscalls like
>> finit_module().
>
> OK.
>
>> >> We already have a global sysctl for blocking direct-loading (modules_disabled).
>> >
>> > My point was that even if you have a CAP_NET_ADMIN check on request_module(),
>> > finit_module() will not check for it, so a crafty userspace could still try
>> > to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
>> > check.
>>
>> The finit_module() uses CAP_SYS_MODULE which should allow all modules
>> and in this context it should be more privileged than CAP_NET_ADMIN
>> which is only for "netdev-%s" (to not load arbitrary modules with it).
>>
>> finit_module() coming from request_module() always has the
>> CAP_NET_ADMIN, hence the check is done before.
>
> But since CAP_SYS_MODULE is more restrictive, what's the point in checking
> for CAP_NET_ADMIN?

For backward compatibility with 'netdev' modules since it is for those.


>> > So unless I'm missing something, I see no point in adding extra checks for
>> > request_module() but nothing for the respective load_module().
>>
>> I see, request_module() is called from kernel context which runs in
>> init namespace will full capabilities, the spawned userspace modprobe
>> will get CAP_SYS_MODULE and all other caps, then after comes modprobe
>> and load_module().
>
> Right, so defining the gains of adding this extra check is not very clear
> yet. It would seem a benefit exists, what is it?

it will able to filter if the request_module() should continue loading
the module or deny it which prevents spawning the *privileged*
usermode helper. This is all based on are we allowed to load new
features or not, or IOW I don't want to allow new features or modules
autoloading from now and on, as stated in the cover letter for various
benefit including security, reduce the amount of kernel code running,
but also do not allow new features for anyone like tunneling, etc.


>> Btw as suggested by Linus I will update with request_module_cap() and > I can
>> offer my help maintaining these bits too.
>
> Can you start by extending lib/test_module.c and
> tools/testing/selftests/kmod/kmod.sh with a proof of concept of the gains here,
> as well as ensuring things work as expected ?

Alright Luis, thanks for the hint, yes I will make sure to cover these.

For gains, kees already answered in the other email, and please check
the DCCP exploit and others linked in the cover letter.


Thank you!

>   Luis



-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 22:52               ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-28 22:52 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 11:18 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Tue, Nov 28, 2017 at 10:33:27PM +0100, Djalal Harouni wrote:
>> On Tue, Nov 28, 2017 at 10:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> > On Tue, Nov 28, 2017 at 12:11:34PM -0800, Kees Cook wrote:
>> >> On Tue, Nov 28, 2017 at 11:14 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> >> > kmod is just a helper to poke userpsace to load a module, that's it.
>> >> >
>> >> > The old init_module() and newer finit_module() do the real handy work or
>> >> > module loading, and both currently only use may_init_module():
>> >> >
>> >> > static int may_init_module(void)
>> >> > {
>> >> >         if (!capable(CAP_SYS_MODULE) || modules_disabled)
>> >> >                 return -EPERM;
>> >> >
>> >> >         return 0;
>> >> > }
>> >> >
>> >> > This begs the question:
>> >> >
>> >> >   o If userspace just tries to just use raw finit_module() do we want similar
>> >> >     checks?
>> >> >
>> >> > Otherwise, correct me if I'm wrong this all seems pointless.
>> >>
>> >> Hm? That's direct-loading, not auto-loading. This series is only about
>> >> auto-loading.
>> >
>> > And *all* auto-loading uses aliases? What's the difference between auto-loading
>> > and direct-loading?
>>
>> Not all auto-loading uses aliases, auto-loading is when kernel code
>> calls request_module() to loads the feature that was not present,
>
> It seems the actual interest here is system call implicated request_module()
> calls? Because there are uses of request_module() which may be module hacks,
> and not implicated via system calls.

Indeed.


>> and direct-loading in this thread is the direct syscalls like
>> finit_module().
>
> OK.
>
>> >> We already have a global sysctl for blocking direct-loading (modules_disabled).
>> >
>> > My point was that even if you have a CAP_NET_ADMIN check on request_module(),
>> > finit_module() will not check for it, so a crafty userspace could still try
>> > to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
>> > check.
>>
>> The finit_module() uses CAP_SYS_MODULE which should allow all modules
>> and in this context it should be more privileged than CAP_NET_ADMIN
>> which is only for "netdev-%s" (to not load arbitrary modules with it).
>>
>> finit_module() coming from request_module() always has the
>> CAP_NET_ADMIN, hence the check is done before.
>
> But since CAP_SYS_MODULE is more restrictive, what's the point in checking
> for CAP_NET_ADMIN?

For backward compatibility with 'netdev' modules since it is for those.


>> > So unless I'm missing something, I see no point in adding extra checks for
>> > request_module() but nothing for the respective load_module().
>>
>> I see, request_module() is called from kernel context which runs in
>> init namespace will full capabilities, the spawned userspace modprobe
>> will get CAP_SYS_MODULE and all other caps, then after comes modprobe
>> and load_module().
>
> Right, so defining the gains of adding this extra check is not very clear
> yet. It would seem a benefit exists, what is it?

it will able to filter if the request_module() should continue loading
the module or deny it which prevents spawning the *privileged*
usermode helper. This is all based on are we allowed to load new
features or not, or IOW I don't want to allow new features or modules
autoloading from now and on, as stated in the cover letter for various
benefit including security, reduce the amount of kernel code running,
but also do not allow new features for anyone like tunneling, etc.


>> Btw as suggested by Linus I will update with request_module_cap() and > I can
>> offer my help maintaining these bits too.
>
> Can you start by extending lib/test_module.c and
> tools/testing/selftests/kmod/kmod.sh with a proof of concept of the gains here,
> as well as ensuring things work as expected ?

Alright Luis, thanks for the hint, yes I will make sure to cover these.

For gains, kees already answered in the other email, and please check
the DCCP exploit and others linked in the cover letter.


Thank you!

>   Luis



-- 
tixxdz
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-28 22:52               ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-28 22:52 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, James Morris,
	Ben Hutchings, Solar Designer, Serge Hallyn, Jessica Yu,
	Rusty Russell, LKML, linux-security-module, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 11:18 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Tue, Nov 28, 2017 at 10:33:27PM +0100, Djalal Harouni wrote:
>> On Tue, Nov 28, 2017 at 10:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> > On Tue, Nov 28, 2017 at 12:11:34PM -0800, Kees Cook wrote:
>> >> On Tue, Nov 28, 2017 at 11:14 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> >> > kmod is just a helper to poke userpsace to load a module, that's it.
>> >> >
>> >> > The old init_module() and newer finit_module() do the real handy work or
>> >> > module loading, and both currently only use may_init_module():
>> >> >
>> >> > static int may_init_module(void)
>> >> > {
>> >> >         if (!capable(CAP_SYS_MODULE) || modules_disabled)
>> >> >                 return -EPERM;
>> >> >
>> >> >         return 0;
>> >> > }
>> >> >
>> >> > This begs the question:
>> >> >
>> >> >   o If userspace just tries to just use raw finit_module() do we want similar
>> >> >     checks?
>> >> >
>> >> > Otherwise, correct me if I'm wrong this all seems pointless.
>> >>
>> >> Hm? That's direct-loading, not auto-loading. This series is only about
>> >> auto-loading.
>> >
>> > And *all* auto-loading uses aliases? What's the difference between auto-loading
>> > and direct-loading?
>>
>> Not all auto-loading uses aliases, auto-loading is when kernel code
>> calls request_module() to loads the feature that was not present,
>
> It seems the actual interest here is system call implicated request_module()
> calls? Because there are uses of request_module() which may be module hacks,
> and not implicated via system calls.

Indeed.


>> and direct-loading in this thread is the direct syscalls like
>> finit_module().
>
> OK.
>
>> >> We already have a global sysctl for blocking direct-loading (modules_disabled).
>> >
>> > My point was that even if you have a CAP_NET_ADMIN check on request_module(),
>> > finit_module() will not check for it, so a crafty userspace could still try
>> > to just finit_module() directly, and completely then bypass the CAP_NET_ADMIN
>> > check.
>>
>> The finit_module() uses CAP_SYS_MODULE which should allow all modules
>> and in this context it should be more privileged than CAP_NET_ADMIN
>> which is only for "netdev-%s" (to not load arbitrary modules with it).
>>
>> finit_module() coming from request_module() always has the
>> CAP_NET_ADMIN, hence the check is done before.
>
> But since CAP_SYS_MODULE is more restrictive, what's the point in checking
> for CAP_NET_ADMIN?

For backward compatibility with 'netdev' modules since it is for those.


>> > So unless I'm missing something, I see no point in adding extra checks for
>> > request_module() but nothing for the respective load_module().
>>
>> I see, request_module() is called from kernel context which runs in
>> init namespace will full capabilities, the spawned userspace modprobe
>> will get CAP_SYS_MODULE and all other caps, then after comes modprobe
>> and load_module().
>
> Right, so defining the gains of adding this extra check is not very clear
> yet. It would seem a benefit exists, what is it?

it will able to filter if the request_module() should continue loading
the module or deny it which prevents spawning the *privileged*
usermode helper. This is all based on are we allowed to load new
features or not, or IOW I don't want to allow new features or modules
autoloading from now and on, as stated in the cover letter for various
benefit including security, reduce the amount of kernel code running,
but also do not allow new features for anyone like tunneling, etc.


>> Btw as suggested by Linus I will update with request_module_cap() and > I can
>> offer my help maintaining these bits too.
>
> Can you start by extending lib/test_module.c and
> tools/testing/selftests/kmod/kmod.sh with a proof of concept of the gains here,
> as well as ensuring things work as expected ?

Alright Luis, thanks for the hint, yes I will make sure to cover these.

For gains, kees already answered in the other email, and please check
the DCCP exploit and others linked in the cover letter.


Thank you!

>   Luis



-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 21:33                       ` Kees Cook
  (?)
@ 2017-11-28 23:23                         ` Theodore Ts'o
  -1 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-28 23:23 UTC (permalink / raw)
  To: Kees Cook
  Cc: Linus Torvalds, Djalal Harouni, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 01:33:40PM -0800, Kees Cook wrote:
> As I've said before, this isn't a theoretical attack surface. This
> year alone there have been three known-exploitable flaws exposed by
> autoloading:
> 
> The exploit for CVE-2017-2636 uses int n_hdlc = N_HDLC; ioctl(fd,
> TIOCSETD, &n_hdlc) [1]. This is using the existing "tty-ldisc-"
> prefix, and is intentionally unprivileged.
> 
> The exploit for CVE-2017-6074 uses socket(PF_INET6, SOCK_DCCP,
> IPPROTO_IP) [2]. This is using the existing proto prefix, and is
> intentionally unprivileged.

So in these two cases, if the kernel was built w/o modules, and HDLC
and DCCP was built-in, you'd be screwed, then?

Is the goal here to protect people using distro kernels which build
the world as modules, including dodgy pieces of kernel code that are
bug-ridden?

If so, then presumably 90% of the problem you've cited can be done by
creating a script which takes a look of the modules that are normally
in use once the machine is in production, and then deleting everything
else?  Correct?

And yes, this will potentially break some users, but the security
folks who are advocating for the more aggressive version of this
change seem to be OK with breaking users, so they can do this without
making kernel changes.  Good luck getting Red Hat and SuSE to accept
such a change, though....

						- Ted

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 23:23                         ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-28 23:23 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 01:33:40PM -0800, Kees Cook wrote:
> As I've said before, this isn't a theoretical attack surface. This
> year alone there have been three known-exploitable flaws exposed by
> autoloading:
> 
> The exploit for CVE-2017-2636 uses int n_hdlc = N_HDLC; ioctl(fd,
> TIOCSETD, &n_hdlc) [1]. This is using the existing "tty-ldisc-"
> prefix, and is intentionally unprivileged.
> 
> The exploit for CVE-2017-6074 uses socket(PF_INET6, SOCK_DCCP,
> IPPROTO_IP) [2]. This is using the existing proto prefix, and is
> intentionally unprivileged.

So in these two cases, if the kernel was built w/o modules, and HDLC
and DCCP was built-in, you'd be screwed, then?

Is the goal here to protect people using distro kernels which build
the world as modules, including dodgy pieces of kernel code that are
bug-ridden?

If so, then presumably 90% of the problem you've cited can be done by
creating a script which takes a look of the modules that are normally
in use once the machine is in production, and then deleting everything
else?  Correct?

And yes, this will potentially break some users, but the security
folks who are advocating for the more aggressive version of this
change seem to be OK with breaking users, so they can do this without
making kernel changes.  Good luck getting Red Hat and SuSE to accept
such a change, though....

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 23:23                         ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-28 23:23 UTC (permalink / raw)
  To: Kees Cook
  Cc: Linus Torvalds, Djalal Harouni, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 01:33:40PM -0800, Kees Cook wrote:
> As I've said before, this isn't a theoretical attack surface. This
> year alone there have been three known-exploitable flaws exposed by
> autoloading:
> 
> The exploit for CVE-2017-2636 uses int n_hdlc = N_HDLC; ioctl(fd,
> TIOCSETD, &n_hdlc) [1]. This is using the existing "tty-ldisc-"
> prefix, and is intentionally unprivileged.
> 
> The exploit for CVE-2017-6074 uses socket(PF_INET6, SOCK_DCCP,
> IPPROTO_IP) [2]. This is using the existing proto prefix, and is
> intentionally unprivileged.

So in these two cases, if the kernel was built w/o modules, and HDLC
and DCCP was built-in, you'd be screwed, then?

Is the goal here to protect people using distro kernels which build
the world as modules, including dodgy pieces of kernel code that are
bug-ridden?

If so, then presumably 90% of the problem you've cited can be done by
creating a script which takes a look of the modules that are normally
in use once the machine is in production, and then deleting everything
else?  Correct?

And yes, this will potentially break some users, but the security
folks who are advocating for the more aggressive version of this
change seem to be OK with breaking users, so they can do this without
making kernel changes.  Good luck getting Red Hat and SuSE to accept
such a change, though....

						- Ted

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 23:23                         ` Theodore Ts'o
  (?)
@ 2017-11-28 23:29                           ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 23:29 UTC (permalink / raw)
  To: Theodore Ts'o, Kees Cook, Linus Torvalds, Djalal Harouni,
	Jonathan Corbet, James Morris, LSM List,
	Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 3:23 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Tue, Nov 28, 2017 at 01:33:40PM -0800, Kees Cook wrote:
>> As I've said before, this isn't a theoretical attack surface. This
>> year alone there have been three known-exploitable flaws exposed by
>> autoloading:
>>
>> The exploit for CVE-2017-2636 uses int n_hdlc = N_HDLC; ioctl(fd,
>> TIOCSETD, &n_hdlc) [1]. This is using the existing "tty-ldisc-"
>> prefix, and is intentionally unprivileged.
>>
>> The exploit for CVE-2017-6074 uses socket(PF_INET6, SOCK_DCCP,
>> IPPROTO_IP) [2]. This is using the existing proto prefix, and is
>> intentionally unprivileged.
>
> So in these two cases, if the kernel was built w/o modules, and HDLC
> and DCCP was built-in, you'd be screwed, then?

Sure, but that's not the common situation.

> Is the goal here to protect people using distro kernels which build
> the world as modules, including dodgy pieces of kernel code that are
> bug-ridden?

The bulk of the risk comes from distro kernels, yes. (Though "bug
ridden" is a strong statement. There are and will be bugs, scattered
across a wide portion of the kernel, it's just that modules tend to
cover most of that attack surface.)

> If so, then presumably 90% of the problem you've cited can be done by
> creating a script which takes a look of the modules that are normally
> in use once the machine is in production, and then deleting everything
> else?  Correct?

This isn't always obvious nor is it easy: a kernel upgrade will
reinstall everything, etc. This is effectively a blacklist, and we
need an easy whitelist approach. And an admin may not want to make it
so hard for themselves to load some special filesystem module that
they have to hunt it down and reinstall it. Or they'll want some
containers to be able to trigger module loading and others not to.
They just want to know their users aren't triggering module loading
that expands the kernel attack surface.

> And yes, this will potentially break some users, but the security
> folks who are advocating for the more aggressive version of this
> change seem to be OK with breaking users, so they can do this without
> making kernel changes.  Good luck getting Red Hat and SuSE to accept
> such a change, though....

We should strive to make the kernel easy to protect. Expecting admins
to jump through these hoops isn't a sensible way forward.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 23:29                           ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 23:29 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 3:23 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Tue, Nov 28, 2017 at 01:33:40PM -0800, Kees Cook wrote:
>> As I've said before, this isn't a theoretical attack surface. This
>> year alone there have been three known-exploitable flaws exposed by
>> autoloading:
>>
>> The exploit for CVE-2017-2636 uses int n_hdlc = N_HDLC; ioctl(fd,
>> TIOCSETD, &n_hdlc) [1]. This is using the existing "tty-ldisc-"
>> prefix, and is intentionally unprivileged.
>>
>> The exploit for CVE-2017-6074 uses socket(PF_INET6, SOCK_DCCP,
>> IPPROTO_IP) [2]. This is using the existing proto prefix, and is
>> intentionally unprivileged.
>
> So in these two cases, if the kernel was built w/o modules, and HDLC
> and DCCP was built-in, you'd be screwed, then?

Sure, but that's not the common situation.

> Is the goal here to protect people using distro kernels which build
> the world as modules, including dodgy pieces of kernel code that are
> bug-ridden?

The bulk of the risk comes from distro kernels, yes. (Though "bug
ridden" is a strong statement. There are and will be bugs, scattered
across a wide portion of the kernel, it's just that modules tend to
cover most of that attack surface.)

> If so, then presumably 90% of the problem you've cited can be done by
> creating a script which takes a look of the modules that are normally
> in use once the machine is in production, and then deleting everything
> else?  Correct?

This isn't always obvious nor is it easy: a kernel upgrade will
reinstall everything, etc. This is effectively a blacklist, and we
need an easy whitelist approach. And an admin may not want to make it
so hard for themselves to load some special filesystem module that
they have to hunt it down and reinstall it. Or they'll want some
containers to be able to trigger module loading and others not to.
They just want to know their users aren't triggering module loading
that expands the kernel attack surface.

> And yes, this will potentially break some users, but the security
> folks who are advocating for the more aggressive version of this
> change seem to be OK with breaking users, so they can do this without
> making kernel changes.  Good luck getting Red Hat and SuSE to accept
> such a change, though....

We should strive to make the kernel easy to protect. Expecting admins
to jump through these hoops isn't a sensible way forward.

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 23:29                           ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-28 23:29 UTC (permalink / raw)
  To: Theodore Ts'o, Kees Cook, Linus Torvalds, Djalal Harouni,
	Jonathan Corbet, James Morris, LSM List,
	Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 3:23 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Tue, Nov 28, 2017 at 01:33:40PM -0800, Kees Cook wrote:
>> As I've said before, this isn't a theoretical attack surface. This
>> year alone there have been three known-exploitable flaws exposed by
>> autoloading:
>>
>> The exploit for CVE-2017-2636 uses int n_hdlc = N_HDLC; ioctl(fd,
>> TIOCSETD, &n_hdlc) [1]. This is using the existing "tty-ldisc-"
>> prefix, and is intentionally unprivileged.
>>
>> The exploit for CVE-2017-6074 uses socket(PF_INET6, SOCK_DCCP,
>> IPPROTO_IP) [2]. This is using the existing proto prefix, and is
>> intentionally unprivileged.
>
> So in these two cases, if the kernel was built w/o modules, and HDLC
> and DCCP was built-in, you'd be screwed, then?

Sure, but that's not the common situation.

> Is the goal here to protect people using distro kernels which build
> the world as modules, including dodgy pieces of kernel code that are
> bug-ridden?

The bulk of the risk comes from distro kernels, yes. (Though "bug
ridden" is a strong statement. There are and will be bugs, scattered
across a wide portion of the kernel, it's just that modules tend to
cover most of that attack surface.)

> If so, then presumably 90% of the problem you've cited can be done by
> creating a script which takes a look of the modules that are normally
> in use once the machine is in production, and then deleting everything
> else?  Correct?

This isn't always obvious nor is it easy: a kernel upgrade will
reinstall everything, etc. This is effectively a blacklist, and we
need an easy whitelist approach. And an admin may not want to make it
so hard for themselves to load some special filesystem module that
they have to hunt it down and reinstall it. Or they'll want some
containers to be able to trigger module loading and others not to.
They just want to know their users aren't triggering module loading
that expands the kernel attack surface.

> And yes, this will potentially break some users, but the security
> folks who are advocating for the more aggressive version of this
> change seem to be OK with breaking users, so they can do this without
> making kernel changes.  Good luck getting Red Hat and SuSE to accept
> such a change, though....

We should strive to make the kernel easy to protect. Expecting admins
to jump through these hoops isn't a sensible way forward.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 23:29                           ` Kees Cook
  (?)
@ 2017-11-28 23:49                             ` Theodore Ts'o
  -1 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-28 23:49 UTC (permalink / raw)
  To: Kees Cook
  Cc: Linus Torvalds, Djalal Harouni, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 03:29:01PM -0800, Kees Cook wrote:
> > So in these two cases, if the kernel was built w/o modules, and HDLC
> > and DCCP was built-in, you'd be screwed, then?
> 
> Sure, but that's not the common situation.
> 
> > Is the goal here to protect people using distro kernels which build
> > the world as modules, including dodgy pieces of kernel code that are
> > bug-ridden?
> 
> The bulk of the risk comes from distro kernels, yes. (Though "bug
> ridden" is a strong statement. There are and will be bugs, scattered
> across a wide portion of the kernel, it's just that modules tend to
> cover most of that attack surface.)

OK, but if the goal is to protect users who are running distro
kernels, then a kernel config that breaks some percentage of users is
****highly**** unlikely to be enabled by Red Hat and SuSE, right?  And
many of these users either can't (from a skills perspective) or won't
(because they lose all support from the enterprise distro's help desk)
recompile their kernel to enable an "might break 3% of all users ---
oh, well" config option.

Which argues for being extremely conservative about making something
that has an extremely low probability of breaking users, and it points
out why Geo Kozey's "who cares about breaking users; security is
IMPORTANT" argument is so wrong-headed.

If the goal is to protect distro kernels, but a sloppy approach
guarantees that distro kernels will never enable it, is it really
worth it?

	   	       	       	    	  	 - Ted

P.S.  This is where it might be useful to get some input from the Red
Hat and SuSE support teams.  How many angry user calls to their help
desk are they willing to field before they'll just turn off the kernel
config option for their kernels?

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 23:49                             ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-28 23:49 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 03:29:01PM -0800, Kees Cook wrote:
> > So in these two cases, if the kernel was built w/o modules, and HDLC
> > and DCCP was built-in, you'd be screwed, then?
> 
> Sure, but that's not the common situation.
> 
> > Is the goal here to protect people using distro kernels which build
> > the world as modules, including dodgy pieces of kernel code that are
> > bug-ridden?
> 
> The bulk of the risk comes from distro kernels, yes. (Though "bug
> ridden" is a strong statement. There are and will be bugs, scattered
> across a wide portion of the kernel, it's just that modules tend to
> cover most of that attack surface.)

OK, but if the goal is to protect users who are running distro
kernels, then a kernel config that breaks some percentage of users is
****highly**** unlikely to be enabled by Red Hat and SuSE, right?  And
many of these users either can't (from a skills perspective) or won't
(because they lose all support from the enterprise distro's help desk)
recompile their kernel to enable an "might break 3% of all users ---
oh, well" config option.

Which argues for being extremely conservative about making something
that has an extremely low probability of breaking users, and it points
out why Geo Kozey's "who cares about breaking users; security is
IMPORTANT" argument is so wrong-headed.

If the goal is to protect distro kernels, but a sloppy approach
guarantees that distro kernels will never enable it, is it really
worth it?

	   	       	       	    	  	 - Ted

P.S.  This is where it might be useful to get some input from the Red
Hat and SuSE support teams.  How many angry user calls to their help
desk are they willing to field before they'll just turn off the kernel
config option for their kernels?
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 23:49                             ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-28 23:49 UTC (permalink / raw)
  To: Kees Cook
  Cc: Linus Torvalds, Djalal Harouni, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 03:29:01PM -0800, Kees Cook wrote:
> > So in these two cases, if the kernel was built w/o modules, and HDLC
> > and DCCP was built-in, you'd be screwed, then?
> 
> Sure, but that's not the common situation.
> 
> > Is the goal here to protect people using distro kernels which build
> > the world as modules, including dodgy pieces of kernel code that are
> > bug-ridden?
> 
> The bulk of the risk comes from distro kernels, yes. (Though "bug
> ridden" is a strong statement. There are and will be bugs, scattered
> across a wide portion of the kernel, it's just that modules tend to
> cover most of that attack surface.)

OK, but if the goal is to protect users who are running distro
kernels, then a kernel config that breaks some percentage of users is
****highly**** unlikely to be enabled by Red Hat and SuSE, right?  And
many of these users either can't (from a skills perspective) or won't
(because they lose all support from the enterprise distro's help desk)
recompile their kernel to enable an "might break 3% of all users ---
oh, well" config option.

Which argues for being extremely conservative about making something
that has an extremely low probability of breaking users, and it points
out why Geo Kozey's "who cares about breaking users; security is
IMPORTANT" argument is so wrong-headed.

If the goal is to protect distro kernels, but a sloppy approach
guarantees that distro kernels will never enable it, is it really
worth it?

	   	       	       	    	  	 - Ted

P.S.  This is where it might be useful to get some input from the Red
Hat and SuSE support teams.  How many angry user calls to their help
desk are they willing to field before they'll just turn off the kernel
config option for their kernels?

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 21:51                       ` Geo Kozey
  (?)
@ 2017-11-28 23:51                         ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-28 23:51 UTC (permalink / raw)
  To: Geo Kozey; +Cc: LSM List, Linux Kernel Mailing List, kernel-hardening

On Tue, Nov 28, 2017 at 1:51 PM, Geo Kozey <geokozey@mailfence.com> wrote:
>
> What about "we're insecure by default but you can't do anything to change this"? It describes current situation.

Go away, and don't send me patches until you have dug your head out of
whatever hole you have put it in..

If this is the kind of shit-headed responses I get from the
"hardening" list, then I don't want to have anything to do with you
guys.

Seriously.

I sent out a long explanation of what's wrong with the hardening
people last week. It made the news. If you still don't understand,
you're simply not worth working with.

If you cannot help improve kernel security for the default case, and
you can't even be bothered to try, and only want to fix some special
case that doesn't then improve anything at all for most people, I
really _really_ suggest you go play in your own sandbox.

Because clearly, if you're not interested in improving things for
anybody else, why the hell should you care about the upstream kernel
anyway?

That's what this boils down to: if you send me patches, you had better
strive to improve security for everybody, not just for some little
locked-down special case.

We're not grsecurity. We never have been. We're not interested in the
crazy people. We're interested in the kind of security that is
generally applicable.

To the mainline kernel, not breaking existing users matters, but it
also matters that the patches make sense for everybody, because
otherwise, why be mainline?

So a patch that avoids breaking existing users, but also doesn't
actually improve anything for existing users, simply shouldn't be part
of the mainline kernel.

Comprende?

                    Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 23:51                         ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-28 23:51 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 1:51 PM, Geo Kozey <geokozey@mailfence.com> wrote:
>
> What about "we're insecure by default but you can't do anything to change this"? It describes current situation.

Go away, and don't send me patches until you have dug your head out of
whatever hole you have put it in..

If this is the kind of shit-headed responses I get from the
"hardening" list, then I don't want to have anything to do with you
guys.

Seriously.

I sent out a long explanation of what's wrong with the hardening
people last week. It made the news. If you still don't understand,
you're simply not worth working with.

If you cannot help improve kernel security for the default case, and
you can't even be bothered to try, and only want to fix some special
case that doesn't then improve anything at all for most people, I
really _really_ suggest you go play in your own sandbox.

Because clearly, if you're not interested in improving things for
anybody else, why the hell should you care about the upstream kernel
anyway?

That's what this boils down to: if you send me patches, you had better
strive to improve security for everybody, not just for some little
locked-down special case.

We're not grsecurity. We never have been. We're not interested in the
crazy people. We're interested in the kind of security that is
generally applicable.

To the mainline kernel, not breaking existing users matters, but it
also matters that the patches make sense for everybody, because
otherwise, why be mainline?

So a patch that avoids breaking existing users, but also doesn't
actually improve anything for existing users, simply shouldn't be part
of the mainline kernel.

Comprende?

                    Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 23:51                         ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-28 23:51 UTC (permalink / raw)
  To: Geo Kozey; +Cc: LSM List, Linux Kernel Mailing List, kernel-hardening

On Tue, Nov 28, 2017 at 1:51 PM, Geo Kozey <geokozey@mailfence.com> wrote:
>
> What about "we're insecure by default but you can't do anything to change this"? It describes current situation.

Go away, and don't send me patches until you have dug your head out of
whatever hole you have put it in..

If this is the kind of shit-headed responses I get from the
"hardening" list, then I don't want to have anything to do with you
guys.

Seriously.

I sent out a long explanation of what's wrong with the hardening
people last week. It made the news. If you still don't understand,
you're simply not worth working with.

If you cannot help improve kernel security for the default case, and
you can't even be bothered to try, and only want to fix some special
case that doesn't then improve anything at all for most people, I
really _really_ suggest you go play in your own sandbox.

Because clearly, if you're not interested in improving things for
anybody else, why the hell should you care about the upstream kernel
anyway?

That's what this boils down to: if you send me patches, you had better
strive to improve security for everybody, not just for some little
locked-down special case.

We're not grsecurity. We never have been. We're not interested in the
crazy people. We're interested in the kind of security that is
generally applicable.

To the mainline kernel, not breaking existing users matters, but it
also matters that the patches make sense for everybody, because
otherwise, why be mainline?

So a patch that avoids breaking existing users, but also doesn't
actually improve anything for existing users, simply shouldn't be part
of the mainline kernel.

Comprende?

                    Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 23:23                         ` Theodore Ts'o
  (?)
@ 2017-11-28 23:53                           ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-28 23:53 UTC (permalink / raw)
  To: Theodore Ts'o, Kees Cook, Linus Torvalds, Djalal Harouni,
	Jonathan Corbet, James Morris, LSM List,
	Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Wed, Nov 29, 2017 at 12:23 AM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Tue, Nov 28, 2017 at 01:33:40PM -0800, Kees Cook wrote:
>> As I've said before, this isn't a theoretical attack surface. This
>> year alone there have been three known-exploitable flaws exposed by
>> autoloading:
>>
>> The exploit for CVE-2017-2636 uses int n_hdlc = N_HDLC; ioctl(fd,
>> TIOCSETD, &n_hdlc) [1]. This is using the existing "tty-ldisc-"
>> prefix, and is intentionally unprivileged.
>>
>> The exploit for CVE-2017-6074 uses socket(PF_INET6, SOCK_DCCP,
>> IPPROTO_IP) [2]. This is using the existing proto prefix, and is
>> intentionally unprivileged.
>
> So in these two cases, if the kernel was built w/o modules, and HDLC
> and DCCP was built-in, you'd be screwed, then?
>
> Is the goal here to protect people using distro kernels which build
> the world as modules, including dodgy pieces of kernel code that are
> bug-ridden?
>
> If so, then presumably 90% of the problem you've cited can be done by
> creating a script which takes a look of the modules that are normally
> in use once the machine is in production, and then deleting everything
> else?  Correct?
>
> And yes, this will potentially break some users, but the security
> folks who are advocating for the more aggressive version of this
> change seem to be OK with breaking users, so they can do this without
> making kernel changes.  Good luck getting Red Hat and SuSE to accept
> such a change, though....

The patches does not change default and make it easy for users and we
have request for this, not all world is Red Hat / SuSE , I build
embedded Linux for clients when I manage to have some, and I clearly
would have set this to my clients since most of them won't be able to
afford all the signing and complexity, now how I should allow modules
load/unload replace with newer versions, but restrict some of their
apps from triggering it, "modules_disabled=1" is not practical. I
can't build a perfect version for every usecase,  and they started to
ship apps for IoT, even containers for IoT, yes they do it and they
use the same os for various use cases! so it is not about those, even
embedded vendors have one single shared layer that they use for all
their products.

For distros, the target is also containers and sandboxes, and they are
already interested in it.

P.S. please the cover letter already mentions that this is for Embedded and IoT.

>                                                 - Ted



-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 23:53                           ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-28 23:53 UTC (permalink / raw)
  To: linux-security-module

On Wed, Nov 29, 2017 at 12:23 AM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Tue, Nov 28, 2017 at 01:33:40PM -0800, Kees Cook wrote:
>> As I've said before, this isn't a theoretical attack surface. This
>> year alone there have been three known-exploitable flaws exposed by
>> autoloading:
>>
>> The exploit for CVE-2017-2636 uses int n_hdlc = N_HDLC; ioctl(fd,
>> TIOCSETD, &n_hdlc) [1]. This is using the existing "tty-ldisc-"
>> prefix, and is intentionally unprivileged.
>>
>> The exploit for CVE-2017-6074 uses socket(PF_INET6, SOCK_DCCP,
>> IPPROTO_IP) [2]. This is using the existing proto prefix, and is
>> intentionally unprivileged.
>
> So in these two cases, if the kernel was built w/o modules, and HDLC
> and DCCP was built-in, you'd be screwed, then?
>
> Is the goal here to protect people using distro kernels which build
> the world as modules, including dodgy pieces of kernel code that are
> bug-ridden?
>
> If so, then presumably 90% of the problem you've cited can be done by
> creating a script which takes a look of the modules that are normally
> in use once the machine is in production, and then deleting everything
> else?  Correct?
>
> And yes, this will potentially break some users, but the security
> folks who are advocating for the more aggressive version of this
> change seem to be OK with breaking users, so they can do this without
> making kernel changes.  Good luck getting Red Hat and SuSE to accept
> such a change, though....

The patches does not change default and make it easy for users and we
have request for this, not all world is Red Hat / SuSE , I build
embedded Linux for clients when I manage to have some, and I clearly
would have set this to my clients since most of them won't be able to
afford all the signing and complexity, now how I should allow modules
load/unload replace with newer versions, but restrict some of their
apps from triggering it, "modules_disabled=1" is not practical. I
can't build a perfect version for every usecase,  and they started to
ship apps for IoT, even containers for IoT, yes they do it and they
use the same os for various use cases! so it is not about those, even
embedded vendors have one single shared layer that they use for all
their products.

For distros, the target is also containers and sandboxes, and they are
already interested in it.

P.S. please the cover letter already mentions that this is for Embedded and IoT.

>                                                 - Ted



-- 
tixxdz
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-28 23:53                           ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-28 23:53 UTC (permalink / raw)
  To: Theodore Ts'o, Kees Cook, Linus Torvalds, Djalal Harouni,
	Jonathan Corbet, James Morris, LSM List,
	Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Wed, Nov 29, 2017 at 12:23 AM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Tue, Nov 28, 2017 at 01:33:40PM -0800, Kees Cook wrote:
>> As I've said before, this isn't a theoretical attack surface. This
>> year alone there have been three known-exploitable flaws exposed by
>> autoloading:
>>
>> The exploit for CVE-2017-2636 uses int n_hdlc = N_HDLC; ioctl(fd,
>> TIOCSETD, &n_hdlc) [1]. This is using the existing "tty-ldisc-"
>> prefix, and is intentionally unprivileged.
>>
>> The exploit for CVE-2017-6074 uses socket(PF_INET6, SOCK_DCCP,
>> IPPROTO_IP) [2]. This is using the existing proto prefix, and is
>> intentionally unprivileged.
>
> So in these two cases, if the kernel was built w/o modules, and HDLC
> and DCCP was built-in, you'd be screwed, then?
>
> Is the goal here to protect people using distro kernels which build
> the world as modules, including dodgy pieces of kernel code that are
> bug-ridden?
>
> If so, then presumably 90% of the problem you've cited can be done by
> creating a script which takes a look of the modules that are normally
> in use once the machine is in production, and then deleting everything
> else?  Correct?
>
> And yes, this will potentially break some users, but the security
> folks who are advocating for the more aggressive version of this
> change seem to be OK with breaking users, so they can do this without
> making kernel changes.  Good luck getting Red Hat and SuSE to accept
> such a change, though....

The patches does not change default and make it easy for users and we
have request for this, not all world is Red Hat / SuSE , I build
embedded Linux for clients when I manage to have some, and I clearly
would have set this to my clients since most of them won't be able to
afford all the signing and complexity, now how I should allow modules
load/unload replace with newer versions, but restrict some of their
apps from triggering it, "modules_disabled=1" is not practical. I
can't build a perfect version for every usecase,  and they started to
ship apps for IoT, even containers for IoT, yes they do it and they
use the same os for various use cases! so it is not about those, even
embedded vendors have one single shared layer that they use for all
their products.

For distros, the target is also containers and sandboxes, and they are
already interested in it.

P.S. please the cover letter already mentions that this is for Embedded and IoT.

>                                                 - Ted



-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 23:51                         ` Linus Torvalds
  (?)
@ 2017-11-29  0:17                           ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29  0:17 UTC (permalink / raw)
  To: Geo Kozey; +Cc: LSM List, Linux Kernel Mailing List, kernel-hardening

On Tue, Nov 28, 2017 at 3:51 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So a patch that avoids breaking existing users, but also doesn't
> actually improve anything for existing users, simply shouldn't be part
> of the mainline kernel.

Just to clarify: maybe it ends up being truly impossible to make the
default be more restrictive. The default certainly won't be the most
restrictive option.

But at the same time, if people can't even be bothered to try to
improve the general case, and only do things that you have to opt in
for, it really isn't much of an improvement. We had this whole
"opt-in" discussion for another thread entirely, and it basically
didn't improve anything for anybody for the better half of a decade.

Hardening that only works for special cases isn't hardening at all. It
will just mean that 99+% of all kernel developers won't see the
fallout at all.

Yes, something like Android may be 99% of the devices, but it's a very
small portion of the core developer base because the hardware is all
locked down, and it's an even smaller portion of the usage patterns.

So I can see some people say "We can use this for android and protect
the 99%". But if it then is basically invisible to the rest of the
user base, it means that all those servers etc aren't getting the kind
of protection they should have.

Just to take that DCCP thing as an example: being a module wasn't what
made it vulnerable. It would have been vulnerable compiled in too.
What made it vulnerable was that the DCCP code simply isn't widely
enough used and tested, and basically barely on life support. And it
was available much too widely despite that.

So all this is about is to make for a smaller attack surface.

But if it turns out that we can make the attack surface smaller by
simply white-listing a few modules that we know are actively used and
feel better about the quality of, that makes for a much smaller attack
surface _too_. And it does so in general, without having to set some
flag that 99% of all MIS people won't even really know about.

So that's why I want people to look at a different approach. Yes, the
opt-in model means that by default nothing changes. That protects
against the whole "oops, we don't break user space". But it has a huge
downside.

The model that I am a proponent of is to take a softer approach
initially: don't forbid module loading (because that breaks users),
but instead _warn_ about non-root module loading. And then we can
start fixing the cases that we find.

See? This is *exactly* the same thing that the user-mode access thing
was about. Hardening people need to get over their "hard rules"
mindset. We don't kill processes or forbid them from doing things that
might be bad. We start by warning about them, to see what "might be
bad" cases are actually normal, and not actually bad at all. And then
we use that information to guide our notion of what should actually
trigger a stronger response.

                   Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29  0:17                           ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29  0:17 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 3:51 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So a patch that avoids breaking existing users, but also doesn't
> actually improve anything for existing users, simply shouldn't be part
> of the mainline kernel.

Just to clarify: maybe it ends up being truly impossible to make the
default be more restrictive. The default certainly won't be the most
restrictive option.

But at the same time, if people can't even be bothered to try to
improve the general case, and only do things that you have to opt in
for, it really isn't much of an improvement. We had this whole
"opt-in" discussion for another thread entirely, and it basically
didn't improve anything for anybody for the better half of a decade.

Hardening that only works for special cases isn't hardening at all. It
will just mean that 99+% of all kernel developers won't see the
fallout at all.

Yes, something like Android may be 99% of the devices, but it's a very
small portion of the core developer base because the hardware is all
locked down, and it's an even smaller portion of the usage patterns.

So I can see some people say "We can use this for android and protect
the 99%". But if it then is basically invisible to the rest of the
user base, it means that all those servers etc aren't getting the kind
of protection they should have.

Just to take that DCCP thing as an example: being a module wasn't what
made it vulnerable. It would have been vulnerable compiled in too.
What made it vulnerable was that the DCCP code simply isn't widely
enough used and tested, and basically barely on life support. And it
was available much too widely despite that.

So all this is about is to make for a smaller attack surface.

But if it turns out that we can make the attack surface smaller by
simply white-listing a few modules that we know are actively used and
feel better about the quality of, that makes for a much smaller attack
surface _too_. And it does so in general, without having to set some
flag that 99% of all MIS people won't even really know about.

So that's why I want people to look at a different approach. Yes, the
opt-in model means that by default nothing changes. That protects
against the whole "oops, we don't break user space". But it has a huge
downside.

The model that I am a proponent of is to take a softer approach
initially: don't forbid module loading (because that breaks users),
but instead _warn_ about non-root module loading. And then we can
start fixing the cases that we find.

See? This is *exactly* the same thing that the user-mode access thing
was about. Hardening people need to get over their "hard rules"
mindset. We don't kill processes or forbid them from doing things that
might be bad. We start by warning about them, to see what "might be
bad" cases are actually normal, and not actually bad at all. And then
we use that information to guide our notion of what should actually
trigger a stronger response.

                   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29  0:17                           ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29  0:17 UTC (permalink / raw)
  To: Geo Kozey; +Cc: LSM List, Linux Kernel Mailing List, kernel-hardening

On Tue, Nov 28, 2017 at 3:51 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So a patch that avoids breaking existing users, but also doesn't
> actually improve anything for existing users, simply shouldn't be part
> of the mainline kernel.

Just to clarify: maybe it ends up being truly impossible to make the
default be more restrictive. The default certainly won't be the most
restrictive option.

But at the same time, if people can't even be bothered to try to
improve the general case, and only do things that you have to opt in
for, it really isn't much of an improvement. We had this whole
"opt-in" discussion for another thread entirely, and it basically
didn't improve anything for anybody for the better half of a decade.

Hardening that only works for special cases isn't hardening at all. It
will just mean that 99+% of all kernel developers won't see the
fallout at all.

Yes, something like Android may be 99% of the devices, but it's a very
small portion of the core developer base because the hardware is all
locked down, and it's an even smaller portion of the usage patterns.

So I can see some people say "We can use this for android and protect
the 99%". But if it then is basically invisible to the rest of the
user base, it means that all those servers etc aren't getting the kind
of protection they should have.

Just to take that DCCP thing as an example: being a module wasn't what
made it vulnerable. It would have been vulnerable compiled in too.
What made it vulnerable was that the DCCP code simply isn't widely
enough used and tested, and basically barely on life support. And it
was available much too widely despite that.

So all this is about is to make for a smaller attack surface.

But if it turns out that we can make the attack surface smaller by
simply white-listing a few modules that we know are actively used and
feel better about the quality of, that makes for a much smaller attack
surface _too_. And it does so in general, without having to set some
flag that 99% of all MIS people won't even really know about.

So that's why I want people to look at a different approach. Yes, the
opt-in model means that by default nothing changes. That protects
against the whole "oops, we don't break user space". But it has a huge
downside.

The model that I am a proponent of is to take a softer approach
initially: don't forbid module loading (because that breaks users),
but instead _warn_ about non-root module loading. And then we can
start fixing the cases that we find.

See? This is *exactly* the same thing that the user-mode access thing
was about. Hardening people need to get over their "hard rules"
mindset. We don't kill processes or forbid them from doing things that
might be bad. We start by warning about them, to see what "might be
bad" cases are actually normal, and not actually bad at all. And then
we use that information to guide our notion of what should actually
trigger a stronger response.

                   Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 23:49                             ` Theodore Ts'o
  (?)
@ 2017-11-29  0:18                               ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-29  0:18 UTC (permalink / raw)
  To: Theodore Ts'o, Kees Cook, Linus Torvalds, Djalal Harouni,
	Jonathan Corbet, James Morris, LSM List,
	Linux Kernel Mailing List, kernel-hardening, Geo Kozey,
	Tycho Andersen

On Tue, Nov 28, 2017 at 3:49 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> OK, but if the goal is to protect users who are running distro
> kernels, then a kernel config that breaks some percentage of users is
> ****highly**** unlikely to be enabled by Red Hat and SuSE, right?  And
> many of these users either can't (from a skills perspective) or won't
> (because they lose all support from the enterprise distro's help desk)
> recompile their kernel to enable an "might break 3% of all users ---
> oh, well" config option.

There's a spectrum across "insane not to be done everywhere"
(STRICT_KERNEL_RWX), "this is a good idea for nearly all Linux
systems" (link restrictions), "this can break some common use-cases,
but protects systems without that use-case" (user-namespace
disabling), "this breaks most systems, but specialized deployments
really need it" (modules_disabled).

There's also a difference between immutable CONFIG options that cannot
be disabled at runtime, those that can, global sysctls, per-namespace
controls, etc etc. The kernel is all about providing admins with knobs
to tweak their performance and security. Suddenly being told that we
can't create optional improvements is very odd.

Now, being told "make it easier to audit all the module loading we're
already doing so we can further reduce needless exposures for everyone
even without this feature enabled", THAT makes sense. My point there
is that we've already done those kinds of things; see commits like:

  7f78e0351394 ("fs: Limit sys_mount to only request filesystem modules.")
  5d26a105b5a7 ("crypto: prefix module autoloading with "crypto-"")
  4943ba16bbc2 ("crypto: include crypto- module prefix in template")

> Which argues for being extremely conservative about making something
> that has an extremely low probability of breaking users, and it points
> out why Geo Kozey's "who cares about breaking users; security is
> IMPORTANT" argument is so wrong-headed.

I don't want to break users either. I want to provide meaningful ways
for admins to reduce their kernel attack surface. Djalal and I aren't
advocating for on-by-default removal of module auto-loading (that
would have been a very small patch!). The idea was to provide a
dynamic control over it, and make that available to distros. As shown
in the patch series, it would be immediately put to use with systemd
for process-tree isolation and for container restriction.

> If the goal is to protect distro kernels, but a sloppy approach
> guarantees that distro kernels will never enable it, is it really
> worth it?

I don't think this is sloppy and of course distros would see use,
since systemd would be using it.

> P.S.  This is where it might be useful to get some input from the Red
> Hat and SuSE support teams.  How many angry user calls to their help
> desk are they willing to field before they'll just turn off the kernel
> config option for their kernels?

This isn't about giant-hammer CONFIGs -- this is like more like
PR_SET_DUMPABLE or seccomp.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29  0:18                               ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-29  0:18 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 3:49 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> OK, but if the goal is to protect users who are running distro
> kernels, then a kernel config that breaks some percentage of users is
> ****highly**** unlikely to be enabled by Red Hat and SuSE, right?  And
> many of these users either can't (from a skills perspective) or won't
> (because they lose all support from the enterprise distro's help desk)
> recompile their kernel to enable an "might break 3% of all users ---
> oh, well" config option.

There's a spectrum across "insane not to be done everywhere"
(STRICT_KERNEL_RWX), "this is a good idea for nearly all Linux
systems" (link restrictions), "this can break some common use-cases,
but protects systems without that use-case" (user-namespace
disabling), "this breaks most systems, but specialized deployments
really need it" (modules_disabled).

There's also a difference between immutable CONFIG options that cannot
be disabled at runtime, those that can, global sysctls, per-namespace
controls, etc etc. The kernel is all about providing admins with knobs
to tweak their performance and security. Suddenly being told that we
can't create optional improvements is very odd.

Now, being told "make it easier to audit all the module loading we're
already doing so we can further reduce needless exposures for everyone
even without this feature enabled", THAT makes sense. My point there
is that we've already done those kinds of things; see commits like:

  7f78e0351394 ("fs: Limit sys_mount to only request filesystem modules.")
  5d26a105b5a7 ("crypto: prefix module autoloading with "crypto-"")
  4943ba16bbc2 ("crypto: include crypto- module prefix in template")

> Which argues for being extremely conservative about making something
> that has an extremely low probability of breaking users, and it points
> out why Geo Kozey's "who cares about breaking users; security is
> IMPORTANT" argument is so wrong-headed.

I don't want to break users either. I want to provide meaningful ways
for admins to reduce their kernel attack surface. Djalal and I aren't
advocating for on-by-default removal of module auto-loading (that
would have been a very small patch!). The idea was to provide a
dynamic control over it, and make that available to distros. As shown
in the patch series, it would be immediately put to use with systemd
for process-tree isolation and for container restriction.

> If the goal is to protect distro kernels, but a sloppy approach
> guarantees that distro kernels will never enable it, is it really
> worth it?

I don't think this is sloppy and of course distros would see use,
since systemd would be using it.

> P.S.  This is where it might be useful to get some input from the Red
> Hat and SuSE support teams.  How many angry user calls to their help
> desk are they willing to field before they'll just turn off the kernel
> config option for their kernels?

This isn't about giant-hammer CONFIGs -- this is like more like
PR_SET_DUMPABLE or seccomp.

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29  0:18                               ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-29  0:18 UTC (permalink / raw)
  To: Theodore Ts'o, Kees Cook, Linus Torvalds, Djalal Harouni,
	Jonathan Corbet, James Morris, LSM List,
	Linux Kernel Mailing List, kernel-hardening, Geo Kozey,
	Tycho Andersen

On Tue, Nov 28, 2017 at 3:49 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> OK, but if the goal is to protect users who are running distro
> kernels, then a kernel config that breaks some percentage of users is
> ****highly**** unlikely to be enabled by Red Hat and SuSE, right?  And
> many of these users either can't (from a skills perspective) or won't
> (because they lose all support from the enterprise distro's help desk)
> recompile their kernel to enable an "might break 3% of all users ---
> oh, well" config option.

There's a spectrum across "insane not to be done everywhere"
(STRICT_KERNEL_RWX), "this is a good idea for nearly all Linux
systems" (link restrictions), "this can break some common use-cases,
but protects systems without that use-case" (user-namespace
disabling), "this breaks most systems, but specialized deployments
really need it" (modules_disabled).

There's also a difference between immutable CONFIG options that cannot
be disabled at runtime, those that can, global sysctls, per-namespace
controls, etc etc. The kernel is all about providing admins with knobs
to tweak their performance and security. Suddenly being told that we
can't create optional improvements is very odd.

Now, being told "make it easier to audit all the module loading we're
already doing so we can further reduce needless exposures for everyone
even without this feature enabled", THAT makes sense. My point there
is that we've already done those kinds of things; see commits like:

  7f78e0351394 ("fs: Limit sys_mount to only request filesystem modules.")
  5d26a105b5a7 ("crypto: prefix module autoloading with "crypto-"")
  4943ba16bbc2 ("crypto: include crypto- module prefix in template")

> Which argues for being extremely conservative about making something
> that has an extremely low probability of breaking users, and it points
> out why Geo Kozey's "who cares about breaking users; security is
> IMPORTANT" argument is so wrong-headed.

I don't want to break users either. I want to provide meaningful ways
for admins to reduce their kernel attack surface. Djalal and I aren't
advocating for on-by-default removal of module auto-loading (that
would have been a very small patch!). The idea was to provide a
dynamic control over it, and make that available to distros. As shown
in the patch series, it would be immediately put to use with systemd
for process-tree isolation and for container restriction.

> If the goal is to protect distro kernels, but a sloppy approach
> guarantees that distro kernels will never enable it, is it really
> worth it?

I don't think this is sloppy and of course distros would see use,
since systemd would be using it.

> P.S.  This is where it might be useful to get some input from the Red
> Hat and SuSE support teams.  How many angry user calls to their help
> desk are they willing to field before they'll just turn off the kernel
> config option for their kernels?

This isn't about giant-hammer CONFIGs -- this is like more like
PR_SET_DUMPABLE or seccomp.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-29  0:17                           ` Linus Torvalds
  (?)
@ 2017-11-29  0:26                             ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-29  0:26 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Geo Kozey, LSM List, Linux Kernel Mailing List, kernel-hardening

On Tue, Nov 28, 2017 at 4:17 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> But at the same time, if people can't even be bothered to try to
> improve the general case, and only do things that you have to opt in
> for, it really isn't much of an improvement. We had this whole

Right, and I'm fine to help with those general improvements. As I
mentioned in another reply, we've been steadily narrowing the
exposures by keeping request_module() from being exposed to "all
possible modules" in many of the places it's been exposed to
userspace:

  7f78e0351394 ("fs: Limit sys_mount to only request filesystem modules.")
  5d26a105b5a7 ("crypto: prefix module autoloading with "crypto-"")
  4943ba16bbc2 ("crypto: include crypto- module prefix in template")

I don't want you to think this is an area where work hasn't already
been done to try to improve things.

> The model that I am a proponent of is to take a softer approach
> initially: don't forbid module loading (because that breaks users),
> but instead _warn_ about non-root module loading. And then we can
> start fixing the cases that we find.

I am totally fine with this. The question I'm hoping to have answered
is, "then what?" We already have concrete examples of module
autoloading that will still be need to stay unprivileged and as-is in
the kernel (even if we remove others). What do you see as the way to
allow an admin to turn those off?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29  0:26                             ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-29  0:26 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 4:17 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> But at the same time, if people can't even be bothered to try to
> improve the general case, and only do things that you have to opt in
> for, it really isn't much of an improvement. We had this whole

Right, and I'm fine to help with those general improvements. As I
mentioned in another reply, we've been steadily narrowing the
exposures by keeping request_module() from being exposed to "all
possible modules" in many of the places it's been exposed to
userspace:

  7f78e0351394 ("fs: Limit sys_mount to only request filesystem modules.")
  5d26a105b5a7 ("crypto: prefix module autoloading with "crypto-"")
  4943ba16bbc2 ("crypto: include crypto- module prefix in template")

I don't want you to think this is an area where work hasn't already
been done to try to improve things.

> The model that I am a proponent of is to take a softer approach
> initially: don't forbid module loading (because that breaks users),
> but instead _warn_ about non-root module loading. And then we can
> start fixing the cases that we find.

I am totally fine with this. The question I'm hoping to have answered
is, "then what?" We already have concrete examples of module
autoloading that will still be need to stay unprivileged and as-is in
the kernel (even if we remove others). What do you see as the way to
allow an admin to turn those off?

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29  0:26                             ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-29  0:26 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Geo Kozey, LSM List, Linux Kernel Mailing List, kernel-hardening

On Tue, Nov 28, 2017 at 4:17 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> But at the same time, if people can't even be bothered to try to
> improve the general case, and only do things that you have to opt in
> for, it really isn't much of an improvement. We had this whole

Right, and I'm fine to help with those general improvements. As I
mentioned in another reply, we've been steadily narrowing the
exposures by keeping request_module() from being exposed to "all
possible modules" in many of the places it's been exposed to
userspace:

  7f78e0351394 ("fs: Limit sys_mount to only request filesystem modules.")
  5d26a105b5a7 ("crypto: prefix module autoloading with "crypto-"")
  4943ba16bbc2 ("crypto: include crypto- module prefix in template")

I don't want you to think this is an area where work hasn't already
been done to try to improve things.

> The model that I am a proponent of is to take a softer approach
> initially: don't forbid module loading (because that breaks users),
> but instead _warn_ about non-root module loading. And then we can
> start fixing the cases that we find.

I am totally fine with this. The question I'm hoping to have answered
is, "then what?" We already have concrete examples of module
autoloading that will still be need to stay unprivileged and as-is in
the kernel (even if we remove others). What do you see as the way to
allow an admin to turn those off?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-29  0:26                             ` Kees Cook
  (?)
@ 2017-11-29  0:50                               ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29  0:50 UTC (permalink / raw)
  To: Kees Cook
  Cc: Geo Kozey, LSM List, Linux Kernel Mailing List, kernel-hardening

On Tue, Nov 28, 2017 at 4:26 PM, Kees Cook <keescook@chromium.org> wrote:
>
>> The model that I am a proponent of is to take a softer approach
>> initially: don't forbid module loading (because that breaks users),
>> but instead _warn_ about non-root module loading. And then we can
>> start fixing the cases that we find.
>
> I am totally fine with this. The question I'm hoping to have answered
> is, "then what?" We already have concrete examples of module
> autoloading that will still be need to stay unprivileged and as-is in
> the kernel (even if we remove others). What do you see as the way to
> allow an admin to turn those off?

Just thinking about the DCCP case, where networking people actually
knew it was pretty deprecated and had no real maintainer, I think one
thing to look at would be simply a per-module flag.

That kind of thing should be fairly easy to implement, along the same
lines as the module license - it just sets a flag in the ELF section
headers.

With something like that, we literally could make the default be "no
autoloading except for root", and then just mark the modules that we
think are ok and well maintained.

Sure, if you then do a lock-down mode that makes that flag parsing
stricter, then that's a separate thing. But I suspect we definitely
could be a lot stricter on a per-module basis, and do it in a way
where a normal user wouldn't even notice that we've limited the
autoloading.

But the first step would be to just add some noise. And even with the
per-module flag, at first it would only suppress the noise (ie we'd
still _allow_ loading other modules, they'd just be noisy). Then, if
nobody hollers, maybe the next kernel release we'll make it actually
enforce the flag.

Does that sound reasonable?

               Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29  0:50                               ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29  0:50 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 4:26 PM, Kees Cook <keescook@chromium.org> wrote:
>
>> The model that I am a proponent of is to take a softer approach
>> initially: don't forbid module loading (because that breaks users),
>> but instead _warn_ about non-root module loading. And then we can
>> start fixing the cases that we find.
>
> I am totally fine with this. The question I'm hoping to have answered
> is, "then what?" We already have concrete examples of module
> autoloading that will still be need to stay unprivileged and as-is in
> the kernel (even if we remove others). What do you see as the way to
> allow an admin to turn those off?

Just thinking about the DCCP case, where networking people actually
knew it was pretty deprecated and had no real maintainer, I think one
thing to look at would be simply a per-module flag.

That kind of thing should be fairly easy to implement, along the same
lines as the module license - it just sets a flag in the ELF section
headers.

With something like that, we literally could make the default be "no
autoloading except for root", and then just mark the modules that we
think are ok and well maintained.

Sure, if you then do a lock-down mode that makes that flag parsing
stricter, then that's a separate thing. But I suspect we definitely
could be a lot stricter on a per-module basis, and do it in a way
where a normal user wouldn't even notice that we've limited the
autoloading.

But the first step would be to just add some noise. And even with the
per-module flag, at first it would only suppress the noise (ie we'd
still _allow_ loading other modules, they'd just be noisy). Then, if
nobody hollers, maybe the next kernel release we'll make it actually
enforce the flag.

Does that sound reasonable?

               Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29  0:50                               ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29  0:50 UTC (permalink / raw)
  To: Kees Cook
  Cc: Geo Kozey, LSM List, Linux Kernel Mailing List, kernel-hardening

On Tue, Nov 28, 2017 at 4:26 PM, Kees Cook <keescook@chromium.org> wrote:
>
>> The model that I am a proponent of is to take a softer approach
>> initially: don't forbid module loading (because that breaks users),
>> but instead _warn_ about non-root module loading. And then we can
>> start fixing the cases that we find.
>
> I am totally fine with this. The question I'm hoping to have answered
> is, "then what?" We already have concrete examples of module
> autoloading that will still be need to stay unprivileged and as-is in
> the kernel (even if we remove others). What do you see as the way to
> allow an admin to turn those off?

Just thinking about the DCCP case, where networking people actually
knew it was pretty deprecated and had no real maintainer, I think one
thing to look at would be simply a per-module flag.

That kind of thing should be fairly easy to implement, along the same
lines as the module license - it just sets a flag in the ELF section
headers.

With something like that, we literally could make the default be "no
autoloading except for root", and then just mark the modules that we
think are ok and well maintained.

Sure, if you then do a lock-down mode that makes that flag parsing
stricter, then that's a separate thing. But I suspect we definitely
could be a lot stricter on a per-module basis, and do it in a way
where a normal user wouldn't even notice that we've limited the
autoloading.

But the first step would be to just add some noise. And even with the
per-module flag, at first it would only suppress the noise (ie we'd
still _allow_ loading other modules, they'd just be noisy). Then, if
nobody hollers, maybe the next kernel release we'll make it actually
enforce the flag.

Does that sound reasonable?

               Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-29  0:50                               ` Linus Torvalds
  (?)
@ 2017-11-29  4:26                                 ` Eric W. Biederman
  -1 siblings, 0 replies; 266+ messages in thread
From: Eric W. Biederman @ 2017-11-29  4:26 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Geo Kozey, LSM List, Linux Kernel Mailing List,
	kernel-hardening, Mahesh Bandewar, Serge E. Hallyn

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Tue, Nov 28, 2017 at 4:26 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>>> The model that I am a proponent of is to take a softer approach
>>> initially: don't forbid module loading (because that breaks users),
>>> but instead _warn_ about non-root module loading. And then we can
>>> start fixing the cases that we find.
>>
>> I am totally fine with this. The question I'm hoping to have answered
>> is, "then what?" We already have concrete examples of module
>> autoloading that will still be need to stay unprivileged and as-is in
>> the kernel (even if we remove others). What do you see as the way to
>> allow an admin to turn those off?
>
> Just thinking about the DCCP case, where networking people actually
> knew it was pretty deprecated and had no real maintainer, I think one
> thing to look at would be simply a per-module flag.
>
> That kind of thing should be fairly easy to implement, along the same
> lines as the module license - it just sets a flag in the ELF section
> headers.
>
> With something like that, we literally could make the default be "no
> autoloading except for root", and then just mark the modules that we
> think are ok and well maintained.
>
> Sure, if you then do a lock-down mode that makes that flag parsing
> stricter, then that's a separate thing. But I suspect we definitely
> could be a lot stricter on a per-module basis, and do it in a way
> where a normal user wouldn't even notice that we've limited the
> autoloading.
>
> But the first step would be to just add some noise. And even with the
> per-module flag, at first it would only suppress the noise (ie we'd
> still _allow_ loading other modules, they'd just be noisy). Then, if
> nobody hollers, maybe the next kernel release we'll make it actually
> enforce the flag.
>

On a slight tangent to all of this.

The issue of reducing attack surface has also come up in another thread
and it was suggested there that we make some ns_capable calls
conditionally capable calls so certain pieces of code won't be available
in user namespaces, when we know there is a bug but don't yet have a
good fix rolled out yet.

Which again brings us to attack surface reduction.

I was wondering if perhaps a better way to do some of that, might be to
have places in the kernel where we could use something like ftrace to
add a permission check at a well known functions boundaries and fail the
functions when we want to reduce the attack surface.

It is not as elegant as adding a maintenance status to a module and only
allowing actively maintained modules to be auto-loaded.  But perhaps
with a few more eyes the idea can be fleshed out to something generally useful.

Eric

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29  4:26                                 ` Eric W. Biederman
  0 siblings, 0 replies; 266+ messages in thread
From: Eric W. Biederman @ 2017-11-29  4:26 UTC (permalink / raw)
  To: linux-security-module

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Tue, Nov 28, 2017 at 4:26 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>>> The model that I am a proponent of is to take a softer approach
>>> initially: don't forbid module loading (because that breaks users),
>>> but instead _warn_ about non-root module loading. And then we can
>>> start fixing the cases that we find.
>>
>> I am totally fine with this. The question I'm hoping to have answered
>> is, "then what?" We already have concrete examples of module
>> autoloading that will still be need to stay unprivileged and as-is in
>> the kernel (even if we remove others). What do you see as the way to
>> allow an admin to turn those off?
>
> Just thinking about the DCCP case, where networking people actually
> knew it was pretty deprecated and had no real maintainer, I think one
> thing to look at would be simply a per-module flag.
>
> That kind of thing should be fairly easy to implement, along the same
> lines as the module license - it just sets a flag in the ELF section
> headers.
>
> With something like that, we literally could make the default be "no
> autoloading except for root", and then just mark the modules that we
> think are ok and well maintained.
>
> Sure, if you then do a lock-down mode that makes that flag parsing
> stricter, then that's a separate thing. But I suspect we definitely
> could be a lot stricter on a per-module basis, and do it in a way
> where a normal user wouldn't even notice that we've limited the
> autoloading.
>
> But the first step would be to just add some noise. And even with the
> per-module flag, at first it would only suppress the noise (ie we'd
> still _allow_ loading other modules, they'd just be noisy). Then, if
> nobody hollers, maybe the next kernel release we'll make it actually
> enforce the flag.
>

On a slight tangent to all of this.

The issue of reducing attack surface has also come up in another thread
and it was suggested there that we make some ns_capable calls
conditionally capable calls so certain pieces of code won't be available
in user namespaces, when we know there is a bug but don't yet have a
good fix rolled out yet.

Which again brings us to attack surface reduction.

I was wondering if perhaps a better way to do some of that, might be to
have places in the kernel where we could use something like ftrace to
add a permission check at a well known functions boundaries and fail the
functions when we want to reduce the attack surface.

It is not as elegant as adding a maintenance status to a module and only
allowing actively maintained modules to be auto-loaded.  But perhaps
with a few more eyes the idea can be fleshed out to something generally useful.

Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29  4:26                                 ` Eric W. Biederman
  0 siblings, 0 replies; 266+ messages in thread
From: Eric W. Biederman @ 2017-11-29  4:26 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Geo Kozey, LSM List, Linux Kernel Mailing List,
	kernel-hardening,
	Mahesh Bandewar (महेश
	बंडेवार),
	Serge E. Hallyn

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Tue, Nov 28, 2017 at 4:26 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>>> The model that I am a proponent of is to take a softer approach
>>> initially: don't forbid module loading (because that breaks users),
>>> but instead _warn_ about non-root module loading. And then we can
>>> start fixing the cases that we find.
>>
>> I am totally fine with this. The question I'm hoping to have answered
>> is, "then what?" We already have concrete examples of module
>> autoloading that will still be need to stay unprivileged and as-is in
>> the kernel (even if we remove others). What do you see as the way to
>> allow an admin to turn those off?
>
> Just thinking about the DCCP case, where networking people actually
> knew it was pretty deprecated and had no real maintainer, I think one
> thing to look at would be simply a per-module flag.
>
> That kind of thing should be fairly easy to implement, along the same
> lines as the module license - it just sets a flag in the ELF section
> headers.
>
> With something like that, we literally could make the default be "no
> autoloading except for root", and then just mark the modules that we
> think are ok and well maintained.
>
> Sure, if you then do a lock-down mode that makes that flag parsing
> stricter, then that's a separate thing. But I suspect we definitely
> could be a lot stricter on a per-module basis, and do it in a way
> where a normal user wouldn't even notice that we've limited the
> autoloading.
>
> But the first step would be to just add some noise. And even with the
> per-module flag, at first it would only suppress the noise (ie we'd
> still _allow_ loading other modules, they'd just be noisy). Then, if
> nobody hollers, maybe the next kernel release we'll make it actually
> enforce the flag.
>

On a slight tangent to all of this.

The issue of reducing attack surface has also come up in another thread
and it was suggested there that we make some ns_capable calls
conditionally capable calls so certain pieces of code won't be available
in user namespaces, when we know there is a bug but don't yet have a
good fix rolled out yet.

Which again brings us to attack surface reduction.

I was wondering if perhaps a better way to do some of that, might be to
have places in the kernel where we could use something like ftrace to
add a permission check at a well known functions boundaries and fail the
functions when we want to reduce the attack surface.

It is not as elegant as adding a maintenance status to a module and only
allowing actively maintained modules to be auto-loaded.  But perhaps
with a few more eyes the idea can be fleshed out to something generally useful.

Eric

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-29  0:18                               ` Kees Cook
  (?)
@ 2017-11-29  6:36                                 ` Theodore Ts'o
  -1 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-29  6:36 UTC (permalink / raw)
  To: Kees Cook
  Cc: Linus Torvalds, Djalal Harouni, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey,
	Tycho Andersen

On Tue, Nov 28, 2017 at 04:18:59PM -0800, Kees Cook wrote:
> There's also a difference between immutable CONFIG options that cannot
> be disabled at runtime, those that can, global sysctls, per-namespace
> controls, etc etc. The kernel is all about providing admins with knobs
> to tweak their performance and security. Suddenly being told that we
> can't create optional improvements is very odd.

I just think that tweakable knobs are mostly pointless.  From my
experience the number of sysadmins that adjust knobs is ***tiny***[1].
Put another way, the effort to determine whether tweaking a knob will
result in breakages or will be safe is as much work as creating a
white list of modules that are allowed to be loaded.

[1] And I say that having providing a lot of knobs for ext4.  :-)

This is why some on the kernel-hardening list have argued for making
the default to be opt-out, which means some users will be breaken (and
their answer to that seems to be, "oh well --- gotta break some eggs
to make an omlette".  Sucks if you're one of the eggs, though.)

And I don't see how systemd magically means no one will be broken.  If
you have a non-root process trying to invoke a line discpline which
has to be loaded as a module, if you flip the switch, that process
will be broken.  How does using systemd make the problem go away?

					- Ted

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29  6:36                                 ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-29  6:36 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 04:18:59PM -0800, Kees Cook wrote:
> There's also a difference between immutable CONFIG options that cannot
> be disabled at runtime, those that can, global sysctls, per-namespace
> controls, etc etc. The kernel is all about providing admins with knobs
> to tweak their performance and security. Suddenly being told that we
> can't create optional improvements is very odd.

I just think that tweakable knobs are mostly pointless.  From my
experience the number of sysadmins that adjust knobs is ***tiny***[1].
Put another way, the effort to determine whether tweaking a knob will
result in breakages or will be safe is as much work as creating a
white list of modules that are allowed to be loaded.

[1] And I say that having providing a lot of knobs for ext4.  :-)

This is why some on the kernel-hardening list have argued for making
the default to be opt-out, which means some users will be breaken (and
their answer to that seems to be, "oh well --- gotta break some eggs
to make an omlette".  Sucks if you're one of the eggs, though.)

And I don't see how systemd magically means no one will be broken.  If
you have a non-root process trying to invoke a line discpline which
has to be loaded as a module, if you flip the switch, that process
will be broken.  How does using systemd make the problem go away?

					- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29  6:36                                 ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-29  6:36 UTC (permalink / raw)
  To: Kees Cook
  Cc: Linus Torvalds, Djalal Harouni, Jonathan Corbet, James Morris,
	LSM List, Linux Kernel Mailing List, kernel-hardening, Geo Kozey,
	Tycho Andersen

On Tue, Nov 28, 2017 at 04:18:59PM -0800, Kees Cook wrote:
> There's also a difference between immutable CONFIG options that cannot
> be disabled at runtime, those that can, global sysctls, per-namespace
> controls, etc etc. The kernel is all about providing admins with knobs
> to tweak their performance and security. Suddenly being told that we
> can't create optional improvements is very odd.

I just think that tweakable knobs are mostly pointless.  From my
experience the number of sysadmins that adjust knobs is ***tiny***[1].
Put another way, the effort to determine whether tweaking a knob will
result in breakages or will be safe is as much work as creating a
white list of modules that are allowed to be loaded.

[1] And I say that having providing a lot of knobs for ext4.  :-)

This is why some on the kernel-hardening list have argued for making
the default to be opt-out, which means some users will be breaken (and
their answer to that seems to be, "oh well --- gotta break some eggs
to make an omlette".  Sucks if you're one of the eggs, though.)

And I don't see how systemd magically means no one will be broken.  If
you have a non-root process trying to invoke a line discpline which
has to be loaded as a module, if you flip the switch, that process
will be broken.  How does using systemd make the problem go away?

					- Ted

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-28 22:48                 ` Luis R. Rodriguez
  (?)
@ 2017-11-29  7:49                   ` Michal Kubecek
  -1 siblings, 0 replies; 266+ messages in thread
From: Michal Kubecek @ 2017-11-29  7:49 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Kees Cook, Djalal Harouni, Andy Lutomirski, Andrew Morton,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, LKML, linux-security-module,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 11:48:49PM +0100, Luis R. Rodriguez wrote:
> On Tue, Nov 28, 2017 at 02:18:18PM -0800, Kees Cook wrote:
> > On Tue, Nov 28, 2017 at 2:12 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > On Tue, Nov 28, 2017 at 01:39:58PM -0800, Kees Cook wrote:
> > >> On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > >> > And *all* auto-loading uses aliases? What's the difference
> > >> > between auto-loading and direct-loading?
> > >>
> > >> The difference is the process privileges. Unprivilged autoloading
> > >> (e.g. int n_hdlc = N_HDLC; ioctl(fd,
> > >> TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
> > >> under CAP_SYS_MODULE.
> > >
> > > Ah, so system call implicated request_module() calls.
> > 
> > Yup. Unprivileged user does something that ultimately hits a
> > request_module() in the kernel. Then the kernel calls out with the
> > usermode helper (which has CAP_SYS_MODULE) and calls finit_module().
> 
> Thanks, using this terminology is much better to understand than
> auto-loading, given it does make it clear an unprivileged call was one
> that initiated the request_module() call, there are many uses of
> request_module() which *are* privileged.
> 
> > > OK and since CAP_SYS_MODULE is much more restrictive one could
> > > argue, what's the point here?
> > 
> > The goal is to block an unprivileged user from being able to trigger a
> > module load without blocking root from loading modules directly.
> 
> I see now. Do we have an audit of all system calls which implicate a
> request_module() call? Networking is a good example for sure to start
> off with but I was curious if we have a grasp of how wide spread this
> could be.

I'm not sure it makes sense to classify this by syscalls. In networking,
request_module() can be triggered e.g. by a netlink message (genetlink
family lookup is an example not needing any privileges) so that one of
the syscalls would be sendmsg().

Michal Kubecek

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29  7:49                   ` Michal Kubecek
  0 siblings, 0 replies; 266+ messages in thread
From: Michal Kubecek @ 2017-11-29  7:49 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 11:48:49PM +0100, Luis R. Rodriguez wrote:
> On Tue, Nov 28, 2017 at 02:18:18PM -0800, Kees Cook wrote:
> > On Tue, Nov 28, 2017 at 2:12 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > On Tue, Nov 28, 2017 at 01:39:58PM -0800, Kees Cook wrote:
> > >> On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > >> > And *all* auto-loading uses aliases? What's the difference
> > >> > between auto-loading and direct-loading?
> > >>
> > >> The difference is the process privileges. Unprivilged autoloading
> > >> (e.g. int n_hdlc = N_HDLC; ioctl(fd,
> > >> TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
> > >> under CAP_SYS_MODULE.
> > >
> > > Ah, so system call implicated request_module() calls.
> > 
> > Yup. Unprivileged user does something that ultimately hits a
> > request_module() in the kernel. Then the kernel calls out with the
> > usermode helper (which has CAP_SYS_MODULE) and calls finit_module().
> 
> Thanks, using this terminology is much better to understand than
> auto-loading, given it does make it clear an unprivileged call was one
> that initiated the request_module() call, there are many uses of
> request_module() which *are* privileged.
> 
> > > OK and since CAP_SYS_MODULE is much more restrictive one could
> > > argue, what's the point here?
> > 
> > The goal is to block an unprivileged user from being able to trigger a
> > module load without blocking root from loading modules directly.
> 
> I see now. Do we have an audit of all system calls which implicate a
> request_module() call? Networking is a good example for sure to start
> off with but I was curious if we have a grasp of how wide spread this
> could be.

I'm not sure it makes sense to classify this by syscalls. In networking,
request_module() can be triggered e.g. by a netlink message (genetlink
family lookup is an example not needing any privileges) so that one of
the syscalls would be sendmsg().

Michal Kubecek
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29  7:49                   ` Michal Kubecek
  0 siblings, 0 replies; 266+ messages in thread
From: Michal Kubecek @ 2017-11-29  7:49 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Kees Cook, Djalal Harouni, Andy Lutomirski, Andrew Morton,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, LKML, linux-security-module,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Tue, Nov 28, 2017 at 11:48:49PM +0100, Luis R. Rodriguez wrote:
> On Tue, Nov 28, 2017 at 02:18:18PM -0800, Kees Cook wrote:
> > On Tue, Nov 28, 2017 at 2:12 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > On Tue, Nov 28, 2017 at 01:39:58PM -0800, Kees Cook wrote:
> > >> On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > >> > And *all* auto-loading uses aliases? What's the difference
> > >> > between auto-loading and direct-loading?
> > >>
> > >> The difference is the process privileges. Unprivilged autoloading
> > >> (e.g. int n_hdlc = N_HDLC; ioctl(fd,
> > >> TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
> > >> under CAP_SYS_MODULE.
> > >
> > > Ah, so system call implicated request_module() calls.
> > 
> > Yup. Unprivileged user does something that ultimately hits a
> > request_module() in the kernel. Then the kernel calls out with the
> > usermode helper (which has CAP_SYS_MODULE) and calls finit_module().
> 
> Thanks, using this terminology is much better to understand than
> auto-loading, given it does make it clear an unprivileged call was one
> that initiated the request_module() call, there are many uses of
> request_module() which *are* privileged.
> 
> > > OK and since CAP_SYS_MODULE is much more restrictive one could
> > > argue, what's the point here?
> > 
> > The goal is to block an unprivileged user from being able to trigger a
> > module load without blocking root from loading modules directly.
> 
> I see now. Do we have an audit of all system calls which implicate a
> request_module() call? Networking is a good example for sure to start
> off with but I was curious if we have a grasp of how wide spread this
> could be.

I'm not sure it makes sense to classify this by syscalls. In networking,
request_module() can be triggered e.g. by a netlink message (genetlink
family lookup is an example not needing any privileges) so that one of
the syscalls would be sendmsg().

Michal Kubecek

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-28 21:39           ` Kees Cook
  (?)
@ 2017-11-29 13:46             ` Alan Cox
  -1 siblings, 0 replies; 266+ messages in thread
From: Alan Cox @ 2017-11-29 13:46 UTC (permalink / raw)
  To: Kees Cook
  Cc: Luis R. Rodriguez, Djalal Harouni, Andy Lutomirski,
	Andrew Morton, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell, LKML,
	linux-security-module, kernel-hardening, Jonathan Corbet,
	Ingo Molnar, David S. Miller, Network Development,
	Peter Zijlstra, Linus Torvalds

On Tue, 28 Nov 2017 13:39:58 -0800
Kees Cook <keescook@chromium.org> wrote:

> On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > And *all* auto-loading uses aliases? What's the difference between auto-loading
> > and direct-loading?  
> 
> The difference is the process privileges. Unprivilged autoloading
> (e.g. int n_hdlc = N_HDLC; ioctl(fd,
> TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
> under CAP_SYS_MODULE.

If you have CAP_SYS_DAC you can rename any module to ppp.ko and ask the
network manager (which has the right permissions) to init a ppp
connection. Capabilities alone are simply not enough to do any kind of
useful protection on a current system and the Linux capability model is
broken architecturally and not fixable because fixing it would break lots
of real systems.

I really don't care what the module loading rules end up with and whether
we add CAP_SYS_YET_ANOTHER_MEANINGLESS_FLAG but what is actually needed
is to properly incorporate it into securiy ruiles for whatever LSM you
are using.

Alan

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29 13:46             ` Alan Cox
  0 siblings, 0 replies; 266+ messages in thread
From: Alan Cox @ 2017-11-29 13:46 UTC (permalink / raw)
  To: linux-security-module

On Tue, 28 Nov 2017 13:39:58 -0800
Kees Cook <keescook@chromium.org> wrote:

> On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > And *all* auto-loading uses aliases? What's the difference between auto-loading
> > and direct-loading?  
> 
> The difference is the process privileges. Unprivilged autoloading
> (e.g. int n_hdlc = N_HDLC; ioctl(fd,
> TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
> under CAP_SYS_MODULE.

If you have CAP_SYS_DAC you can rename any module to ppp.ko and ask the
network manager (which has the right permissions) to init a ppp
connection. Capabilities alone are simply not enough to do any kind of
useful protection on a current system and the Linux capability model is
broken architecturally and not fixable because fixing it would break lots
of real systems.

I really don't care what the module loading rules end up with and whether
we add CAP_SYS_YET_ANOTHER_MEANINGLESS_FLAG but what is actually needed
is to properly incorporate it into securiy ruiles for whatever LSM you
are using.

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29 13:46             ` Alan Cox
  0 siblings, 0 replies; 266+ messages in thread
From: Alan Cox @ 2017-11-29 13:46 UTC (permalink / raw)
  To: Kees Cook
  Cc: Luis R. Rodriguez, Djalal Harouni, Andy Lutomirski,
	Andrew Morton, James Morris, Ben Hutchings, Solar Designer,
	Serge Hallyn, Jessica Yu, Rusty Russell, LKML,
	linux-security-module, kernel-hardening, Jonathan Corbet,
	Ingo Molnar, David S. Miller, Network Development,
	Peter Zijlstra, Linus Torvalds

On Tue, 28 Nov 2017 13:39:58 -0800
Kees Cook <keescook@chromium.org> wrote:

> On Tue, Nov 28, 2017 at 1:16 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > And *all* auto-loading uses aliases? What's the difference between auto-loading
> > and direct-loading?  
> 
> The difference is the process privileges. Unprivilged autoloading
> (e.g. int n_hdlc = N_HDLC; ioctl(fd,
> TIOCSETD, &n_hdlc)), triggers a privileged call to finit_module()
> under CAP_SYS_MODULE.

If you have CAP_SYS_DAC you can rename any module to ppp.ko and ask the
network manager (which has the right permissions) to init a ppp
connection. Capabilities alone are simply not enough to do any kind of
useful protection on a current system and the Linux capability model is
broken architecturally and not fixable because fixing it would break lots
of real systems.

I really don't care what the module loading rules end up with and whether
we add CAP_SYS_YET_ANOTHER_MEANINGLESS_FLAG but what is actually needed
is to properly incorporate it into securiy ruiles for whatever LSM you
are using.

Alan

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 23:49                             ` Theodore Ts'o
  (?)
@ 2017-11-29 14:46                               ` Geo Kozey
  -1 siblings, 0 replies; 266+ messages in thread
From: Geo Kozey @ 2017-11-29 14:46 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: LSM List, Linux Kernel Mailing List, kernel-hardening

> From: Theodore Ts'o <tytso@mit.edu>
> Sent: Wed Nov 29 00:49:20 CET 2017
> Subject: Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
> 
> OK, but if the goal is to protect users who are running distro
> kernels, then a kernel config that breaks some percentage of users is
> ****highly**** unlikely to be enabled by Red Hat and SuSE, right?  And
> many of these users either can't (from a skills perspective) or won't
> (because they lose all support from the enterprise distro's help desk)
> recompile their kernel to enable an "might break 3% of all users ---
> oh, well" config option.
> 
> Which argues for being extremely conservative about making something
> that has an extremely low probability of breaking users, and it points
> out why Geo Kozey's "who cares about breaking users; security is
> IMPORTANT" argument is so wrong-headed.
> ----------------------------------------

My argument was the opposite - don't break anything by default. Nobody here wanted to break any users. Some of them may choose stricter option because it actually doesn't break anything relevant for them. That's all. It's the same as perf_event_paranoid, kptr_restrict, yama.ptrace_scope sysctls are working currently.

We're talking about runtime, not compile time option. I doubt RH or SUSE prevent users from changing anything at runtime.


G. K.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29 14:46                               ` Geo Kozey
  0 siblings, 0 replies; 266+ messages in thread
From: Geo Kozey @ 2017-11-29 14:46 UTC (permalink / raw)
  To: linux-security-module

> From: Theodore Ts'o <tytso@mit.edu>
> Sent: Wed Nov 29 00:49:20 CET 2017
> Subject: Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
> 
> OK, but if the goal is to protect users who are running distro
> kernels, then a kernel config that breaks some percentage of users is
> ****highly**** unlikely to be enabled by Red Hat and SuSE, right?  And
> many of these users either can't (from a skills perspective) or won't
> (because they lose all support from the enterprise distro's help desk)
> recompile their kernel to enable an "might break 3% of all users ---
> oh, well" config option.
> 
> Which argues for being extremely conservative about making something
> that has an extremely low probability of breaking users, and it points
> out why Geo Kozey's "who cares about breaking users; security is
> IMPORTANT" argument is so wrong-headed.
> ----------------------------------------

My argument was the opposite - don't break anything by default. Nobody here wanted to break any users. Some of them may choose stricter option because it actually doesn't break anything relevant for them. That's all. It's the same as perf_event_paranoid, kptr_restrict, yama.ptrace_scope sysctls are working currently.

We're talking about runtime, not compile time option. I doubt RH or SUSE prevent users from changing anything at runtime.


G. K.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29 14:46                               ` Geo Kozey
  0 siblings, 0 replies; 266+ messages in thread
From: Geo Kozey @ 2017-11-29 14:46 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: LSM List, Linux Kernel Mailing List, kernel-hardening

> From: Theodore Ts'o <tytso@mit.edu>
> Sent: Wed Nov 29 00:49:20 CET 2017
> Subject: Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
> 
> OK, but if the goal is to protect users who are running distro
> kernels, then a kernel config that breaks some percentage of users is
> ****highly**** unlikely to be enabled by Red Hat and SuSE, right?  And
> many of these users either can't (from a skills perspective) or won't
> (because they lose all support from the enterprise distro's help desk)
> recompile their kernel to enable an "might break 3% of all users ---
> oh, well" config option.
> 
> Which argues for being extremely conservative about making something
> that has an extremely low probability of breaking users, and it points
> out why Geo Kozey's "who cares about breaking users; security is
> IMPORTANT" argument is so wrong-headed.
> ----------------------------------------

My argument was the opposite - don't break anything by default. Nobody here wanted to break any users. Some of them may choose stricter option because it actually doesn't break anything relevant for them. That's all. It's the same as perf_event_paranoid, kptr_restrict, yama.ptrace_scope sysctls are working currently.

We're talking about runtime, not compile time option. I doubt RH or SUSE prevent users from changing anything at runtime.


G. K.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-29 13:46             ` Alan Cox
  (?)
@ 2017-11-29 14:50               ` David Miller
  -1 siblings, 0 replies; 266+ messages in thread
From: David Miller @ 2017-11-29 14:50 UTC (permalink / raw)
  To: gnomes
  Cc: keescook, mcgrof, tixxdz, luto, akpm, james.l.morris,
	ben.hutchings, solar, serge, jeyu, rusty, linux-kernel,
	linux-security-module, kernel-hardening, corbet, mingo, netdev,
	peterz, torvalds

From: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Date: Wed, 29 Nov 2017 13:46:12 +0000

> I really don't care what the module loading rules end up with and
> whether we add CAP_SYS_YET_ANOTHER_MEANINGLESS_FLAG but what is
> actually needed is to properly incorporate it into securiy ruiles
> for whatever LSM you are using.

I'm surprised we're not using the SHA1 hashes or whatever we compute
for the modules to make sure we are loading the foo.ko that we expect
to be.

Then even if someone can rename every file on the system they cannot
force a rogue module to load unless they build one with a proper hash
collision.

Ie. transform module load strings at build time:

	ppp.ko	--> ppp.ko:SHA1

Or something like that.  And the kernel refuses to load a ppp.ko
with a mismatching SHA1.

All of this capability stuff seems to dance a circle around the
problem rather than fix it.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29 14:50               ` David Miller
  0 siblings, 0 replies; 266+ messages in thread
From: David Miller @ 2017-11-29 14:50 UTC (permalink / raw)
  To: linux-security-module

From: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Date: Wed, 29 Nov 2017 13:46:12 +0000

> I really don't care what the module loading rules end up with and
> whether we add CAP_SYS_YET_ANOTHER_MEANINGLESS_FLAG but what is
> actually needed is to properly incorporate it into securiy ruiles
> for whatever LSM you are using.

I'm surprised we're not using the SHA1 hashes or whatever we compute
for the modules to make sure we are loading the foo.ko that we expect
to be.

Then even if someone can rename every file on the system they cannot
force a rogue module to load unless they build one with a proper hash
collision.

Ie. transform module load strings at build time:

	ppp.ko	--> ppp.ko:SHA1

Or something like that.  And the kernel refuses to load a ppp.ko
with a mismatching SHA1.

All of this capability stuff seems to dance a circle around the
problem rather than fix it.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29 14:50               ` David Miller
  0 siblings, 0 replies; 266+ messages in thread
From: David Miller @ 2017-11-29 14:50 UTC (permalink / raw)
  To: gnomes
  Cc: keescook, mcgrof, tixxdz, luto, akpm, james.l.morris,
	ben.hutchings, solar, serge, jeyu, rusty, linux-kernel,
	linux-security-module, kernel-hardening, corbet, mingo, netdev,
	peterz, torvalds

From: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Date: Wed, 29 Nov 2017 13:46:12 +0000

> I really don't care what the module loading rules end up with and
> whether we add CAP_SYS_YET_ANOTHER_MEANINGLESS_FLAG but what is
> actually needed is to properly incorporate it into securiy ruiles
> for whatever LSM you are using.

I'm surprised we're not using the SHA1 hashes or whatever we compute
for the modules to make sure we are loading the foo.ko that we expect
to be.

Then even if someone can rename every file on the system they cannot
force a rogue module to load unless they build one with a proper hash
collision.

Ie. transform module load strings at build time:

	ppp.ko	--> ppp.ko:SHA1

Or something like that.  And the kernel refuses to load a ppp.ko
with a mismatching SHA1.

All of this capability stuff seems to dance a circle around the
problem rather than fix it.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-29  0:17                           ` Linus Torvalds
  (?)
@ 2017-11-29 15:28                             ` Geo Kozey
  -1 siblings, 0 replies; 266+ messages in thread
From: Geo Kozey @ 2017-11-29 15:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: LSM List, Linux Kernel Mailing List, kernel-hardening

> From: Linus Torvalds <torvalds@linux-foundation.org>
> Sent: Wed Nov 29 01:17:05 CET 2017
> To: Geo Kozey <geokozey@mailfence.com>
> Subject: Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
> 
> 
> On Tue, Nov 28, 2017 at 3:51 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > So a patch that avoids breaking existing users, but also doesn't
> > actually improve anything for existing users, simply shouldn't be part
> > of the mainline kernel.
> 
> Just to clarify: maybe it ends up being truly impossible to make the
> default be more restrictive. The default certainly won't be the most
> restrictive option.
> 
> But at the same time, if people can't even be bothered to try to
> improve the general case, and only do things that you have to opt in
> for, it really isn't much of an improvement. We had this whole
> "opt-in" discussion for another thread entirely, and it basically
> didn't improve anything for anybody for the better half of a decade.
> 
> Hardening that only works for special cases isn't hardening at all. It
> will just mean that 99+% of all kernel developers won't see the
> fallout at all.
> 
> Yes, something like Android may be 99% of the devices, but it's a very
> small portion of the core developer base because the hardware is all
> locked down, and it's an even smaller portion of the usage patterns.
> 
> So I can see some people say "We can use this for android and protect
> the 99%". But if it then is basically invisible to the rest of the
> user base, it means that all those servers etc aren't getting the kind
> of protection they should have.
> 
> Just to take that DCCP thing as an example: being a module wasn't what
> made it vulnerable. It would have been vulnerable compiled in too.
> What made it vulnerable was that the DCCP code simply isn't widely
> enough used and tested, and basically barely on life support. And it
> was available much too widely despite that.
> 
> So all this is about is to make for a smaller attack surface.
> 
> But if it turns out that we can make the attack surface smaller by
> simply white-listing a few modules that we know are actively used and
> feel better about the quality of, that makes for a much smaller attack
> surface _too_. And it does so in general, without having to set some
> flag that 99% of all MIS people won't even really know about.
> 
> So that's why I want people to look at a different approach. Yes, the
> opt-in model means that by default nothing changes. That protects
> against the whole "oops, we don't break user space". But it has a huge
> downside.
> 
> The model that I am a proponent of is to take a softer approach
> initially: don't forbid module loading (because that breaks users),
> but instead _warn_ about non-root module loading. And then we can
> start fixing the cases that we find.
> 
> See? This is *exactly* the same thing that the user-mode access thing
> was about. Hardening people need to get over their "hard rules"
> mindset. We don't kill processes or forbid them from doing things that
> might be bad. We start by warning about them, to see what "might be
> bad" cases are actually normal, and not actually bad at all. And then
> we use that information to guide our notion of what should actually
> trigger a stronger response.
> 
>                    Linus

I followed your last week discussion and I agreed with your arguments so
it's unfortunate for me that You put me on the opposite side. I agree that
improving kernel code by finding and fixing existing bugs is the ultimate
goal for security features.

On the other hand the whole debugging process (find, report, wait for fix)
takes a lot of time and work and is basically never ending story. That's why
some kind of prevention which protects users in-between is important for
users which priorities maximal security on their systems.

I like your idea about printing warnings. What about:

modules_autoload_mode =0 (default)
_warn_ about non-root module loading and allow

modules_autoload_mode =1
_warn_ about non-root module loading and deny

This way we can start fixing found cases and also give users option for proactive
defense at the same time without breaking general public. The end game (after
a lot of fixes) would be to make 0 and 1 mostly indistinguishable from security
perspective. 

Yours sincerely

G. K.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29 15:28                             ` Geo Kozey
  0 siblings, 0 replies; 266+ messages in thread
From: Geo Kozey @ 2017-11-29 15:28 UTC (permalink / raw)
  To: linux-security-module

> From: Linus Torvalds <torvalds@linux-foundation.org>
> Sent: Wed Nov 29 01:17:05 CET 2017
> To: Geo Kozey <geokozey@mailfence.com>
> Subject: Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
> 
> 
> On Tue, Nov 28, 2017 at 3:51 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > So a patch that avoids breaking existing users, but also doesn't
> > actually improve anything for existing users, simply shouldn't be part
> > of the mainline kernel.
> 
> Just to clarify: maybe it ends up being truly impossible to make the
> default be more restrictive. The default certainly won't be the most
> restrictive option.
> 
> But at the same time, if people can't even be bothered to try to
> improve the general case, and only do things that you have to opt in
> for, it really isn't much of an improvement. We had this whole
> "opt-in" discussion for another thread entirely, and it basically
> didn't improve anything for anybody for the better half of a decade.
> 
> Hardening that only works for special cases isn't hardening at all. It
> will just mean that 99+% of all kernel developers won't see the
> fallout at all.
> 
> Yes, something like Android may be 99% of the devices, but it's a very
> small portion of the core developer base because the hardware is all
> locked down, and it's an even smaller portion of the usage patterns.
> 
> So I can see some people say "We can use this for android and protect
> the 99%". But if it then is basically invisible to the rest of the
> user base, it means that all those servers etc aren't getting the kind
> of protection they should have.
> 
> Just to take that DCCP thing as an example: being a module wasn't what
> made it vulnerable. It would have been vulnerable compiled in too.
> What made it vulnerable was that the DCCP code simply isn't widely
> enough used and tested, and basically barely on life support. And it
> was available much too widely despite that.
> 
> So all this is about is to make for a smaller attack surface.
> 
> But if it turns out that we can make the attack surface smaller by
> simply white-listing a few modules that we know are actively used and
> feel better about the quality of, that makes for a much smaller attack
> surface _too_. And it does so in general, without having to set some
> flag that 99% of all MIS people won't even really know about.
> 
> So that's why I want people to look at a different approach. Yes, the
> opt-in model means that by default nothing changes. That protects
> against the whole "oops, we don't break user space". But it has a huge
> downside.
> 
> The model that I am a proponent of is to take a softer approach
> initially: don't forbid module loading (because that breaks users),
> but instead _warn_ about non-root module loading. And then we can
> start fixing the cases that we find.
> 
> See? This is *exactly* the same thing that the user-mode access thing
> was about. Hardening people need to get over their "hard rules"
> mindset. We don't kill processes or forbid them from doing things that
> might be bad. We start by warning about them, to see what "might be
> bad" cases are actually normal, and not actually bad at all. And then
> we use that information to guide our notion of what should actually
> trigger a stronger response.
> 
>                    Linus

I followed your last week discussion and I agreed with your arguments so
it's unfortunate for me that You put me on the opposite side. I agree that
improving kernel code by finding and fixing existing bugs is the ultimate
goal for security features.

On the other hand the whole debugging process (find, report, wait for fix)
takes a lot of time and work and is basically never ending story. That's why
some kind of prevention which protects users in-between is important for
users which priorities maximal security on their systems.

I like your idea about printing warnings. What about:

modules_autoload_mode =0 (default)
_warn_ about non-root module loading and allow

modules_autoload_mode =1
_warn_ about non-root module loading and deny

This way we can start fixing found cases and also give users option for proactive
defense at the same time without breaking general public. The end game (after
a lot of fixes) would be to make 0 and 1 mostly indistinguishable from security
perspective. 

Yours sincerely

G. K.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29 15:28                             ` Geo Kozey
  0 siblings, 0 replies; 266+ messages in thread
From: Geo Kozey @ 2017-11-29 15:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: LSM List, Linux Kernel Mailing List, kernel-hardening

> From: Linus Torvalds <torvalds@linux-foundation.org>
> Sent: Wed Nov 29 01:17:05 CET 2017
> To: Geo Kozey <geokozey@mailfence.com>
> Subject: Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
> 
> 
> On Tue, Nov 28, 2017 at 3:51 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > So a patch that avoids breaking existing users, but also doesn't
> > actually improve anything for existing users, simply shouldn't be part
> > of the mainline kernel.
> 
> Just to clarify: maybe it ends up being truly impossible to make the
> default be more restrictive. The default certainly won't be the most
> restrictive option.
> 
> But at the same time, if people can't even be bothered to try to
> improve the general case, and only do things that you have to opt in
> for, it really isn't much of an improvement. We had this whole
> "opt-in" discussion for another thread entirely, and it basically
> didn't improve anything for anybody for the better half of a decade.
> 
> Hardening that only works for special cases isn't hardening at all. It
> will just mean that 99+% of all kernel developers won't see the
> fallout at all.
> 
> Yes, something like Android may be 99% of the devices, but it's a very
> small portion of the core developer base because the hardware is all
> locked down, and it's an even smaller portion of the usage patterns.
> 
> So I can see some people say "We can use this for android and protect
> the 99%". But if it then is basically invisible to the rest of the
> user base, it means that all those servers etc aren't getting the kind
> of protection they should have.
> 
> Just to take that DCCP thing as an example: being a module wasn't what
> made it vulnerable. It would have been vulnerable compiled in too.
> What made it vulnerable was that the DCCP code simply isn't widely
> enough used and tested, and basically barely on life support. And it
> was available much too widely despite that.
> 
> So all this is about is to make for a smaller attack surface.
> 
> But if it turns out that we can make the attack surface smaller by
> simply white-listing a few modules that we know are actively used and
> feel better about the quality of, that makes for a much smaller attack
> surface _too_. And it does so in general, without having to set some
> flag that 99% of all MIS people won't even really know about.
> 
> So that's why I want people to look at a different approach. Yes, the
> opt-in model means that by default nothing changes. That protects
> against the whole "oops, we don't break user space". But it has a huge
> downside.
> 
> The model that I am a proponent of is to take a softer approach
> initially: don't forbid module loading (because that breaks users),
> but instead _warn_ about non-root module loading. And then we can
> start fixing the cases that we find.
> 
> See? This is *exactly* the same thing that the user-mode access thing
> was about. Hardening people need to get over their "hard rules"
> mindset. We don't kill processes or forbid them from doing things that
> might be bad. We start by warning about them, to see what "might be
> bad" cases are actually normal, and not actually bad at all. And then
> we use that information to guide our notion of what should actually
> trigger a stronger response.
> 
>                    Linus

I followed your last week discussion and I agreed with your arguments so
it's unfortunate for me that You put me on the opposite side. I agree that
improving kernel code by finding and fixing existing bugs is the ultimate
goal for security features.

On the other hand the whole debugging process (find, report, wait for fix)
takes a lot of time and work and is basically never ending story. That's why
some kind of prevention which protects users in-between is important for
users which priorities maximal security on their systems.

I like your idea about printing warnings. What about:

modules_autoload_mode =0 (default)
_warn_ about non-root module loading and allow

modules_autoload_mode =1
_warn_ about non-root module loading and deny

This way we can start fixing found cases and also give users option for proactive
defense at the same time without breaking general public. The end game (after
a lot of fixes) would be to make 0 and 1 mostly indistinguishable from security
perspective. 

Yours sincerely

G. K.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-29 14:50               ` David Miller
  (?)
@ 2017-11-29 15:54                 ` Theodore Ts'o
  -1 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-29 15:54 UTC (permalink / raw)
  To: David Miller
  Cc: gnomes, keescook, mcgrof, tixxdz, luto, akpm, james.l.morris,
	ben.hutchings, solar, serge, jeyu, rusty, linux-kernel,
	linux-security-module, kernel-hardening, corbet, mingo, netdev,
	peterz, torvalds

On Wed, Nov 29, 2017 at 09:50:14AM -0500, David Miller wrote:
> From: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
> Date: Wed, 29 Nov 2017 13:46:12 +0000
> 
> > I really don't care what the module loading rules end up with and
> > whether we add CAP_SYS_YET_ANOTHER_MEANINGLESS_FLAG but what is
> > actually needed is to properly incorporate it into securiy ruiles
> > for whatever LSM you are using.
> 
> I'm surprised we're not using the SHA1 hashes or whatever we compute
> for the modules to make sure we are loading the foo.ko that we expect
> to be.

We do have signed modules.  But this won't help us if the user is
using a distro kernel which has compiled some module which is known to
be unmaintained which everyone in the know *expects* to have 0-day
bugs, such as DCCP.  That's because the DCCP module is signed.

We could fix this by adding to the signature used for module signing
to include the module name, so that the bad guy can't rename dccp.ko
to be ppp.ko, I suppose....

> All of this capability stuff seems to dance a circle around the
> problem rather than fix it.

Half the problem here is that with containers, people are changing the
security model, because they want to let untrusted users have "root",
without really having "root".  Part of the fundamental problem is that
there are some well-meaning, but fundamentally misguided people, who
have been asserting: "Containers are just as secure as VM's".

Well, they are not.  And the sooner people get past this, the better
off they'll be....

						- Ted

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29 15:54                 ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-29 15:54 UTC (permalink / raw)
  To: linux-security-module

On Wed, Nov 29, 2017 at 09:50:14AM -0500, David Miller wrote:
> From: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
> Date: Wed, 29 Nov 2017 13:46:12 +0000
> 
> > I really don't care what the module loading rules end up with and
> > whether we add CAP_SYS_YET_ANOTHER_MEANINGLESS_FLAG but what is
> > actually needed is to properly incorporate it into securiy ruiles
> > for whatever LSM you are using.
> 
> I'm surprised we're not using the SHA1 hashes or whatever we compute
> for the modules to make sure we are loading the foo.ko that we expect
> to be.

We do have signed modules.  But this won't help us if the user is
using a distro kernel which has compiled some module which is known to
be unmaintained which everyone in the know *expects* to have 0-day
bugs, such as DCCP.  That's because the DCCP module is signed.

We could fix this by adding to the signature used for module signing
to include the module name, so that the bad guy can't rename dccp.ko
to be ppp.ko, I suppose....

> All of this capability stuff seems to dance a circle around the
> problem rather than fix it.

Half the problem here is that with containers, people are changing the
security model, because they want to let untrusted users have "root",
without really having "root".  Part of the fundamental problem is that
there are some well-meaning, but fundamentally misguided people, who
have been asserting: "Containers are just as secure as VM's".

Well, they are not.  And the sooner people get past this, the better
off they'll be....

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29 15:54                 ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-29 15:54 UTC (permalink / raw)
  To: David Miller
  Cc: gnomes, keescook, mcgrof, tixxdz, luto, akpm, james.l.morris,
	ben.hutchings, solar, serge, jeyu, rusty, linux-kernel,
	linux-security-module, kernel-hardening, corbet, mingo, netdev,
	peterz, torvalds

On Wed, Nov 29, 2017 at 09:50:14AM -0500, David Miller wrote:
> From: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
> Date: Wed, 29 Nov 2017 13:46:12 +0000
> 
> > I really don't care what the module loading rules end up with and
> > whether we add CAP_SYS_YET_ANOTHER_MEANINGLESS_FLAG but what is
> > actually needed is to properly incorporate it into securiy ruiles
> > for whatever LSM you are using.
> 
> I'm surprised we're not using the SHA1 hashes or whatever we compute
> for the modules to make sure we are loading the foo.ko that we expect
> to be.

We do have signed modules.  But this won't help us if the user is
using a distro kernel which has compiled some module which is known to
be unmaintained which everyone in the know *expects* to have 0-day
bugs, such as DCCP.  That's because the DCCP module is signed.

We could fix this by adding to the signature used for module signing
to include the module name, so that the bad guy can't rename dccp.ko
to be ppp.ko, I suppose....

> All of this capability stuff seems to dance a circle around the
> problem rather than fix it.

Half the problem here is that with containers, people are changing the
security model, because they want to let untrusted users have "root",
without really having "root".  Part of the fundamental problem is that
there are some well-meaning, but fundamentally misguided people, who
have been asserting: "Containers are just as secure as VM's".

Well, they are not.  And the sooner people get past this, the better
off they'll be....

						- Ted

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-29 15:54                 ` Theodore Ts'o
  (?)
@ 2017-11-29 15:58                   ` David Miller
  -1 siblings, 0 replies; 266+ messages in thread
From: David Miller @ 2017-11-29 15:58 UTC (permalink / raw)
  To: tytso
  Cc: gnomes, keescook, mcgrof, tixxdz, luto, akpm, james.l.morris,
	ben.hutchings, solar, serge, jeyu, rusty, linux-kernel,
	linux-security-module, kernel-hardening, corbet, mingo, netdev,
	peterz, torvalds

From: Theodore Ts'o <tytso@mit.edu>
Date: Wed, 29 Nov 2017 10:54:06 -0500

> On Wed, Nov 29, 2017 at 09:50:14AM -0500, David Miller wrote:
>> From: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
>> Date: Wed, 29 Nov 2017 13:46:12 +0000
>> 
>> > I really don't care what the module loading rules end up with and
>> > whether we add CAP_SYS_YET_ANOTHER_MEANINGLESS_FLAG but what is
>> > actually needed is to properly incorporate it into securiy ruiles
>> > for whatever LSM you are using.
>> 
>> I'm surprised we're not using the SHA1 hashes or whatever we compute
>> for the modules to make sure we are loading the foo.ko that we expect
>> to be.
> 
> We do have signed modules.  But this won't help us if the user is
> using a distro kernel which has compiled some module which is known to
> be unmaintained which everyone in the know *expects* to have 0-day
> bugs, such as DCCP.  That's because the DCCP module is signed.

That's not what we're talking about.

We're talking about making sure that loading "ppp.ko" really gets
ppp.ko rather than some_other_module.ko renamed to ppp.ko via some
other mechanism.

Both modules have legitimate signatures so the kernel will happily
load both.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29 15:58                   ` David Miller
  0 siblings, 0 replies; 266+ messages in thread
From: David Miller @ 2017-11-29 15:58 UTC (permalink / raw)
  To: linux-security-module

From: Theodore Ts'o <tytso@mit.edu>
Date: Wed, 29 Nov 2017 10:54:06 -0500

> On Wed, Nov 29, 2017 at 09:50:14AM -0500, David Miller wrote:
>> From: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
>> Date: Wed, 29 Nov 2017 13:46:12 +0000
>> 
>> > I really don't care what the module loading rules end up with and
>> > whether we add CAP_SYS_YET_ANOTHER_MEANINGLESS_FLAG but what is
>> > actually needed is to properly incorporate it into securiy ruiles
>> > for whatever LSM you are using.
>> 
>> I'm surprised we're not using the SHA1 hashes or whatever we compute
>> for the modules to make sure we are loading the foo.ko that we expect
>> to be.
> 
> We do have signed modules.  But this won't help us if the user is
> using a distro kernel which has compiled some module which is known to
> be unmaintained which everyone in the know *expects* to have 0-day
> bugs, such as DCCP.  That's because the DCCP module is signed.

That's not what we're talking about.

We're talking about making sure that loading "ppp.ko" really gets
ppp.ko rather than some_other_module.ko renamed to ppp.ko via some
other mechanism.

Both modules have legitimate signatures so the kernel will happily
load both.

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29 15:58                   ` David Miller
  0 siblings, 0 replies; 266+ messages in thread
From: David Miller @ 2017-11-29 15:58 UTC (permalink / raw)
  To: tytso
  Cc: gnomes, keescook, mcgrof, tixxdz, luto, akpm, james.l.morris,
	ben.hutchings, solar, serge, jeyu, rusty, linux-kernel,
	linux-security-module, kernel-hardening, corbet, mingo, netdev,
	peterz, torvalds

From: Theodore Ts'o <tytso@mit.edu>
Date: Wed, 29 Nov 2017 10:54:06 -0500

> On Wed, Nov 29, 2017 at 09:50:14AM -0500, David Miller wrote:
>> From: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
>> Date: Wed, 29 Nov 2017 13:46:12 +0000
>> 
>> > I really don't care what the module loading rules end up with and
>> > whether we add CAP_SYS_YET_ANOTHER_MEANINGLESS_FLAG but what is
>> > actually needed is to properly incorporate it into securiy ruiles
>> > for whatever LSM you are using.
>> 
>> I'm surprised we're not using the SHA1 hashes or whatever we compute
>> for the modules to make sure we are loading the foo.ko that we expect
>> to be.
> 
> We do have signed modules.  But this won't help us if the user is
> using a distro kernel which has compiled some module which is known to
> be unmaintained which everyone in the know *expects* to have 0-day
> bugs, such as DCCP.  That's because the DCCP module is signed.

That's not what we're talking about.

We're talking about making sure that loading "ppp.ko" really gets
ppp.ko rather than some_other_module.ko renamed to ppp.ko via some
other mechanism.

Both modules have legitimate signatures so the kernel will happily
load both.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-29 15:58                   ` David Miller
  (?)
@ 2017-11-29 16:29                     ` Theodore Ts'o
  -1 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-29 16:29 UTC (permalink / raw)
  To: David Miller
  Cc: gnomes, keescook, mcgrof, tixxdz, luto, akpm, james.l.morris,
	ben.hutchings, solar, serge, jeyu, rusty, linux-kernel,
	linux-security-module, kernel-hardening, corbet, mingo, netdev,
	peterz, torvalds

On Wed, Nov 29, 2017 at 10:58:16AM -0500, David Miller wrote:
> That's not what we're talking about.
> 
> We're talking about making sure that loading "ppp.ko" really gets
> ppp.ko rather than some_other_module.ko renamed to ppp.ko via some
> other mechanism.

Right, and the best solution to this problem is to include in the
signature the name of the module.  (One way of doing this is adding
the module name into the cryptographic checksum which is then
digitally signed by the kernel module key; we would have to bump the
version number of the signature format, obviously.)  This binds the
name of the module into the digital signature so that it can't be
renamed.

						- Ted

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29 16:29                     ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-29 16:29 UTC (permalink / raw)
  To: linux-security-module

On Wed, Nov 29, 2017 at 10:58:16AM -0500, David Miller wrote:
> That's not what we're talking about.
> 
> We're talking about making sure that loading "ppp.ko" really gets
> ppp.ko rather than some_other_module.ko renamed to ppp.ko via some
> other mechanism.

Right, and the best solution to this problem is to include in the
signature the name of the module.  (One way of doing this is adding
the module name into the cryptographic checksum which is then
digitally signed by the kernel module key; we would have to bump the
version number of the signature format, obviously.)  This binds the
name of the module into the digital signature so that it can't be
renamed.

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29 16:29                     ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-29 16:29 UTC (permalink / raw)
  To: David Miller
  Cc: gnomes, keescook, mcgrof, tixxdz, luto, akpm, james.l.morris,
	ben.hutchings, solar, serge, jeyu, rusty, linux-kernel,
	linux-security-module, kernel-hardening, corbet, mingo, netdev,
	peterz, torvalds

On Wed, Nov 29, 2017 at 10:58:16AM -0500, David Miller wrote:
> That's not what we're talking about.
> 
> We're talking about making sure that loading "ppp.ko" really gets
> ppp.ko rather than some_other_module.ko renamed to ppp.ko via some
> other mechanism.

Right, and the best solution to this problem is to include in the
signature the name of the module.  (One way of doing this is adding
the module name into the cryptographic checksum which is then
digitally signed by the kernel module key; we would have to bump the
version number of the signature format, obviously.)  This binds the
name of the module into the digital signature so that it can't be
renamed.

						- Ted

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-29 15:54                 ` Theodore Ts'o
  (?)
@ 2017-11-29 17:28                   ` Serge E. Hallyn
  -1 siblings, 0 replies; 266+ messages in thread
From: Serge E. Hallyn @ 2017-11-29 17:28 UTC (permalink / raw)
  To: Theodore Ts'o, David Miller, gnomes, keescook, mcgrof,
	tixxdz, luto, akpm, james.l.morris, ben.hutchings, solar, serge,
	jeyu, rusty, linux-kernel, linux-security-module,
	kernel-hardening, corbet, mingo, netdev, peterz, torvalds

Quoting Theodore Ts'o (tytso@mit.edu):
> Half the problem here is that with containers, people are changing the
> security model, because they want to let untrusted users have "root",
> without really having "root".  Part of the fundamental problem is that
> there are some well-meaning, but fundamentally misguided people, who
> have been asserting: "Containers are just as secure as VM's".
> 
> Well, they are not.  And the sooner people get past this, the better
> off they'll be....

Just to be clear, module loading requires - and must always continue to
require - CAP_SYS_MODULE against the initial user namespace.  Containers
in user namespaces do not have that.

I don't believe anyone has ever claimed that containers which are not in
a user namespace are in any way secure.

(And as for the other claim, I'd prefer to stick to "VMs are in most
cases as insecure as properly configured containers" :)

-serge

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29 17:28                   ` Serge E. Hallyn
  0 siblings, 0 replies; 266+ messages in thread
From: Serge E. Hallyn @ 2017-11-29 17:28 UTC (permalink / raw)
  To: linux-security-module

Quoting Theodore Ts'o (tytso at mit.edu):
> Half the problem here is that with containers, people are changing the
> security model, because they want to let untrusted users have "root",
> without really having "root".  Part of the fundamental problem is that
> there are some well-meaning, but fundamentally misguided people, who
> have been asserting: "Containers are just as secure as VM's".
> 
> Well, they are not.  And the sooner people get past this, the better
> off they'll be....

Just to be clear, module loading requires - and must always continue to
require - CAP_SYS_MODULE against the initial user namespace.  Containers
in user namespaces do not have that.

I don't believe anyone has ever claimed that containers which are not in
a user namespace are in any way secure.

(And as for the other claim, I'd prefer to stick to "VMs are in most
cases as insecure as properly configured containers" :)

-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29 17:28                   ` Serge E. Hallyn
  0 siblings, 0 replies; 266+ messages in thread
From: Serge E. Hallyn @ 2017-11-29 17:28 UTC (permalink / raw)
  To: Theodore Ts'o, David Miller, gnomes, keescook, mcgrof,
	tixxdz, luto, akpm, james.l.morris, ben.hutchings, solar, serge,
	jeyu, rusty, linux-kernel, linux-security-module,
	kernel-hardening, corbet, mingo, netdev, peterz, torvalds

Quoting Theodore Ts'o (tytso@mit.edu):
> Half the problem here is that with containers, people are changing the
> security model, because they want to let untrusted users have "root",
> without really having "root".  Part of the fundamental problem is that
> there are some well-meaning, but fundamentally misguided people, who
> have been asserting: "Containers are just as secure as VM's".
> 
> Well, they are not.  And the sooner people get past this, the better
> off they'll be....

Just to be clear, module loading requires - and must always continue to
require - CAP_SYS_MODULE against the initial user namespace.  Containers
in user namespaces do not have that.

I don't believe anyone has ever claimed that containers which are not in
a user namespace are in any way secure.

(And as for the other claim, I'd prefer to stick to "VMs are in most
cases as insecure as properly configured containers" :)

-serge

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-29  0:50                               ` Linus Torvalds
  (?)
@ 2017-11-29 18:30                                 ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-29 18:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Geo Kozey, LSM List, Linux Kernel Mailing List, kernel-hardening

On Tue, Nov 28, 2017 at 4:50 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Nov 28, 2017 at 4:26 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>>> The model that I am a proponent of is to take a softer approach
>>> initially: don't forbid module loading (because that breaks users),
>>> but instead _warn_ about non-root module loading. And then we can
>>> start fixing the cases that we find.
>>
>> I am totally fine with this. The question I'm hoping to have answered
>> is, "then what?" We already have concrete examples of module
>> autoloading that will still be need to stay unprivileged and as-is in
>> the kernel (even if we remove others). What do you see as the way to
>> allow an admin to turn those off?
>
> Just thinking about the DCCP case, where networking people actually
> knew it was pretty deprecated and had no real maintainer, I think one
> thing to look at would be simply a per-module flag.
>
> That kind of thing should be fairly easy to implement, along the same
> lines as the module license - it just sets a flag in the ELF section
> headers.
>
> With something like that, we literally could make the default be "no
> autoloading except for root", and then just mark the modules that we
> think are ok and well maintained.
>
> Sure, if you then do a lock-down mode that makes that flag parsing
> stricter, then that's a separate thing. But I suspect we definitely
> could be a lot stricter on a per-module basis, and do it in a way
> where a normal user wouldn't even notice that we've limited the
> autoloading.
>
> But the first step would be to just add some noise. And even with the
> per-module flag, at first it would only suppress the noise (ie we'd
> still _allow_ loading other modules, they'd just be noisy). Then, if
> nobody hollers, maybe the next kernel release we'll make it actually
> enforce the flag.
>
> Does that sound reasonable?

Yeah, I think I see the way forward here; thanks for the discussion!

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29 18:30                                 ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-29 18:30 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 4:50 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Nov 28, 2017 at 4:26 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>>> The model that I am a proponent of is to take a softer approach
>>> initially: don't forbid module loading (because that breaks users),
>>> but instead _warn_ about non-root module loading. And then we can
>>> start fixing the cases that we find.
>>
>> I am totally fine with this. The question I'm hoping to have answered
>> is, "then what?" We already have concrete examples of module
>> autoloading that will still be need to stay unprivileged and as-is in
>> the kernel (even if we remove others). What do you see as the way to
>> allow an admin to turn those off?
>
> Just thinking about the DCCP case, where networking people actually
> knew it was pretty deprecated and had no real maintainer, I think one
> thing to look at would be simply a per-module flag.
>
> That kind of thing should be fairly easy to implement, along the same
> lines as the module license - it just sets a flag in the ELF section
> headers.
>
> With something like that, we literally could make the default be "no
> autoloading except for root", and then just mark the modules that we
> think are ok and well maintained.
>
> Sure, if you then do a lock-down mode that makes that flag parsing
> stricter, then that's a separate thing. But I suspect we definitely
> could be a lot stricter on a per-module basis, and do it in a way
> where a normal user wouldn't even notice that we've limited the
> autoloading.
>
> But the first step would be to just add some noise. And even with the
> per-module flag, at first it would only suppress the noise (ie we'd
> still _allow_ loading other modules, they'd just be noisy). Then, if
> nobody hollers, maybe the next kernel release we'll make it actually
> enforce the flag.
>
> Does that sound reasonable?

Yeah, I think I see the way forward here; thanks for the discussion!

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29 18:30                                 ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-29 18:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Geo Kozey, LSM List, Linux Kernel Mailing List, kernel-hardening

On Tue, Nov 28, 2017 at 4:50 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Nov 28, 2017 at 4:26 PM, Kees Cook <keescook@chromium.org> wrote:
>>
>>> The model that I am a proponent of is to take a softer approach
>>> initially: don't forbid module loading (because that breaks users),
>>> but instead _warn_ about non-root module loading. And then we can
>>> start fixing the cases that we find.
>>
>> I am totally fine with this. The question I'm hoping to have answered
>> is, "then what?" We already have concrete examples of module
>> autoloading that will still be need to stay unprivileged and as-is in
>> the kernel (even if we remove others). What do you see as the way to
>> allow an admin to turn those off?
>
> Just thinking about the DCCP case, where networking people actually
> knew it was pretty deprecated and had no real maintainer, I think one
> thing to look at would be simply a per-module flag.
>
> That kind of thing should be fairly easy to implement, along the same
> lines as the module license - it just sets a flag in the ELF section
> headers.
>
> With something like that, we literally could make the default be "no
> autoloading except for root", and then just mark the modules that we
> think are ok and well maintained.
>
> Sure, if you then do a lock-down mode that makes that flag parsing
> stricter, then that's a separate thing. But I suspect we definitely
> could be a lot stricter on a per-module basis, and do it in a way
> where a normal user wouldn't even notice that we've limited the
> autoloading.
>
> But the first step would be to just add some noise. And even with the
> per-module flag, at first it would only suppress the noise (ie we'd
> still _allow_ loading other modules, they'd just be noisy). Then, if
> nobody hollers, maybe the next kernel release we'll make it actually
> enforce the flag.
>
> Does that sound reasonable?

Yeah, I think I see the way forward here; thanks for the discussion!

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-29 18:30                                 ` Kees Cook
  (?)
@ 2017-11-29 18:46                                   ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29 18:46 UTC (permalink / raw)
  To: Kees Cook
  Cc: Geo Kozey, LSM List, Linux Kernel Mailing List, kernel-hardening

On Wed, Nov 29, 2017 at 10:30 AM, Kees Cook <keescook@chromium.org> wrote:
> On Tue, Nov 28, 2017 at 4:50 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Just thinking about the DCCP case, where networking people actually
>> knew it was pretty deprecated and had no real maintainer, I think one
>> thing to look at would be simply a per-module flag.
>>   [ ... ]
>> Does that sound reasonable?
>
> Yeah, I think I see the way forward here; thanks for the discussion!

Note: I don't want to really force that per-module flag if it ends up
being painful, I was really just thinking that considering the DCCP
case, it would be (I think) a really nice solution.

In particular, request_module() ends up having that indirection
through usermode-helper, which makes it potentially very inconvenient
to store the "did the original caller have proper capabilities" and
then check it at actual module load time.

So the module flag is technically easy to add, and it's technically
easy to read at module loading time, but I suspect that it's actually
annoyingly hard to pass the original request_module() capability
information around to where we actually read the module.

That's why it might be _much_ easier to try to do it per call-site
instead. It's not quite as fine-grained (because several call-sites do
things like

        request_module("net-pf-%d-proto-%d", PF_NETLINK, protocol);

that can load a large number of different modules), but if we can get
away with just saying "this particular callsite is ok, because it only
loads the bluetooth module that we thing is trustworthy" or "this
call-site is ok, because you already had to have access to the device
node", that is going to be much simpler and more straightforward.

In other words: I think there are at least two different models to go
after, and there may be practical reasons to prefer one over the
other.

                    Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29 18:46                                   ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29 18:46 UTC (permalink / raw)
  To: linux-security-module

On Wed, Nov 29, 2017 at 10:30 AM, Kees Cook <keescook@chromium.org> wrote:
> On Tue, Nov 28, 2017 at 4:50 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Just thinking about the DCCP case, where networking people actually
>> knew it was pretty deprecated and had no real maintainer, I think one
>> thing to look at would be simply a per-module flag.
>>   [ ... ]
>> Does that sound reasonable?
>
> Yeah, I think I see the way forward here; thanks for the discussion!

Note: I don't want to really force that per-module flag if it ends up
being painful, I was really just thinking that considering the DCCP
case, it would be (I think) a really nice solution.

In particular, request_module() ends up having that indirection
through usermode-helper, which makes it potentially very inconvenient
to store the "did the original caller have proper capabilities" and
then check it at actual module load time.

So the module flag is technically easy to add, and it's technically
easy to read at module loading time, but I suspect that it's actually
annoyingly hard to pass the original request_module() capability
information around to where we actually read the module.

That's why it might be _much_ easier to try to do it per call-site
instead. It's not quite as fine-grained (because several call-sites do
things like

        request_module("net-pf-%d-proto-%d", PF_NETLINK, protocol);

that can load a large number of different modules), but if we can get
away with just saying "this particular callsite is ok, because it only
loads the bluetooth module that we thing is trustworthy" or "this
call-site is ok, because you already had to have access to the device
node", that is going to be much simpler and more straightforward.

In other words: I think there are at least two different models to go
after, and there may be practical reasons to prefer one over the
other.

                    Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29 18:46                                   ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29 18:46 UTC (permalink / raw)
  To: Kees Cook
  Cc: Geo Kozey, LSM List, Linux Kernel Mailing List, kernel-hardening

On Wed, Nov 29, 2017 at 10:30 AM, Kees Cook <keescook@chromium.org> wrote:
> On Tue, Nov 28, 2017 at 4:50 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Just thinking about the DCCP case, where networking people actually
>> knew it was pretty deprecated and had no real maintainer, I think one
>> thing to look at would be simply a per-module flag.
>>   [ ... ]
>> Does that sound reasonable?
>
> Yeah, I think I see the way forward here; thanks for the discussion!

Note: I don't want to really force that per-module flag if it ends up
being painful, I was really just thinking that considering the DCCP
case, it would be (I think) a really nice solution.

In particular, request_module() ends up having that indirection
through usermode-helper, which makes it potentially very inconvenient
to store the "did the original caller have proper capabilities" and
then check it at actual module load time.

So the module flag is technically easy to add, and it's technically
easy to read at module loading time, but I suspect that it's actually
annoyingly hard to pass the original request_module() capability
information around to where we actually read the module.

That's why it might be _much_ easier to try to do it per call-site
instead. It's not quite as fine-grained (because several call-sites do
things like

        request_module("net-pf-%d-proto-%d", PF_NETLINK, protocol);

that can load a large number of different modules), but if we can get
away with just saying "this particular callsite is ok, because it only
loads the bluetooth module that we thing is trustworthy" or "this
call-site is ok, because you already had to have access to the device
node", that is going to be much simpler and more straightforward.

In other words: I think there are at least two different models to go
after, and there may be practical reasons to prefer one over the
other.

                    Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-29 18:46                                   ` Linus Torvalds
  (?)
@ 2017-11-29 18:53                                     ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29 18:53 UTC (permalink / raw)
  To: Kees Cook
  Cc: Geo Kozey, LSM List, Linux Kernel Mailing List, kernel-hardening

On Wed, Nov 29, 2017 at 10:46 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So the module flag is technically easy to add, and it's technically
> easy to read at module loading time, but I suspect that it's actually
> annoyingly hard to pass the original request_module() capability
> information around to where we actually read the module.

One possibly interesting approach would be to run the usermode helper
not as root, but with the credentials of the request_module() caller.

That's arguably the right thing to do (in that request_module() would
never do anything that the user wouldn't be able to do on their own)
and probably what we should have done originally, but while it feels
like a nice solution I suspect it would break pretty much every distro
out there.

Because they all expect modprobe/kmod to be called as root in the
original init-namespace.

                 Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29 18:53                                     ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29 18:53 UTC (permalink / raw)
  To: linux-security-module

On Wed, Nov 29, 2017 at 10:46 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So the module flag is technically easy to add, and it's technically
> easy to read at module loading time, but I suspect that it's actually
> annoyingly hard to pass the original request_module() capability
> information around to where we actually read the module.

One possibly interesting approach would be to run the usermode helper
not as root, but with the credentials of the request_module() caller.

That's arguably the right thing to do (in that request_module() would
never do anything that the user wouldn't be able to do on their own)
and probably what we should have done originally, but while it feels
like a nice solution I suspect it would break pretty much every distro
out there.

Because they all expect modprobe/kmod to be called as root in the
original init-namespace.

                 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29 18:53                                     ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29 18:53 UTC (permalink / raw)
  To: Kees Cook
  Cc: Geo Kozey, LSM List, Linux Kernel Mailing List, kernel-hardening

On Wed, Nov 29, 2017 at 10:46 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So the module flag is technically easy to add, and it's technically
> easy to read at module loading time, but I suspect that it's actually
> annoyingly hard to pass the original request_module() capability
> information around to where we actually read the module.

One possibly interesting approach would be to run the usermode helper
not as root, but with the credentials of the request_module() caller.

That's arguably the right thing to do (in that request_module() would
never do anything that the user wouldn't be able to do on their own)
and probably what we should have done originally, but while it feels
like a nice solution I suspect it would break pretty much every distro
out there.

Because they all expect modprobe/kmod to be called as root in the
original init-namespace.

                 Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-29 18:46                                   ` Linus Torvalds
  (?)
@ 2017-11-29 21:17                                     ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-29 21:17 UTC (permalink / raw)
  To: Linus Torvalds, Djalal Harouni, Jessica Yu
  Cc: LSM List, Linux Kernel Mailing List, kernel-hardening

On Wed, Nov 29, 2017 at 10:46 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Nov 29, 2017 at 10:30 AM, Kees Cook <keescook@chromium.org> wrote:
>> On Tue, Nov 28, 2017 at 4:50 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>>>
>>> Just thinking about the DCCP case, where networking people actually
>>> knew it was pretty deprecated and had no real maintainer, I think one
>>> thing to look at would be simply a per-module flag.
>>>   [ ... ]
>>> Does that sound reasonable?
>>
>> Yeah, I think I see the way forward here; thanks for the discussion!
>
> Note: I don't want to really force that per-module flag if it ends up
> being painful, I was really just thinking that considering the DCCP
> case, it would be (I think) a really nice solution.
>
> In particular, request_module() ends up having that indirection
> through usermode-helper, which makes it potentially very inconvenient
> to store the "did the original caller have proper capabilities" and
> then check it at actual module load time.

Yeah, for this it seems like the permission _evaluation_ would happen
at request_module() time, and it could pass that as a module argument
to the usermode-helper, so that during load_module(), it would have
all the needed information to take an action:

- was this load considered privileged?
- is this module marked privileged-only?
- what are we configured to do in the case of mismatches?

> So the module flag is technically easy to add, and it's technically
> easy to read at module loading time, but I suspect that it's actually
> annoyingly hard to pass the original request_module() capability
> information around to where we actually read the module.

The clever solution grsecurity uses for this is to pass information as
module loading arguments (mentioned above) in call_modprobe(), and
then read them back in load_module(). This seemed like overkill to me,
but it does allow for delaying the decision on the action.

I think this would only be needed to deal with per-module markings,
though, since all other cases the evaluation and action could happen
at request_module() time. (More on this below.)

> That's why it might be _much_ easier to try to do it per call-site
> instead. It's not quite as fine-grained (because several call-sites do
> things like
>
>         request_module("net-pf-%d-proto-%d", PF_NETLINK, protocol);
>
> that can load a large number of different modules), but if we can get
> away with just saying "this particular callsite is ok, because it only
> loads the bluetooth module that we thing is trustworthy" or "this
> call-site is ok, because you already had to have access to the device
> node", that is going to be much simpler and more straightforward.

This is basically what we've historically already tried to do: limit
request_module() calls to either one or a set of const strings, or to
load with a format template. Removing all the
request_module(whatever_the_user_says) calls has (hopefully) been
completed, though sometimes weird stuff sneaks by.

The difficulty I see here is that it's not always clear which are
expected to be privileged loads or not. Even an analysis[1] of whether
or not request_module() is using a format template doesn't tell us
much. Is it an unprivileged net module like your example above or is
it some privileged device hotplug event like
request_module("ulpi:v%04xp%04x", ulpi->id.vendor, ulpi->id.product)?

> In other words: I think there are at least two different models to go
> after, and there may be practical reasons to prefer one over the
> other.

I would agree, and it's why I thought the original series was good: I
feel like we've already done the harder work of narrowing
request_module() calls by subsystem prefixes over the years, and
separating privilege with a new API didn't make sense to me. The
privileges are a state to check in the existing API, etc. But as I
said, if there's a way forward you're happy with, let's do it. It just
needs to get to the point where an admin can explicitly disable
unprivileged module autoloading with some appropriate level of
granularity.

[email merging your later reply...]
> One possibly interesting approach would be to run the usermode helper
> not as root, but with the credentials of the request_module() caller.
>
> That's arguably the right thing to do (in that request_module() would
> never do anything that the user wouldn't be able to do on their own)
> and probably what we should have done originally, but while it feels
> like a nice solution I suspect it would break pretty much every distro
> out there.
>
> Because they all expect modprobe/kmod to be called as root in the
> original init-namespace.

Dropping to the request_module() privileges for the usermode helper
would absolutely break the world. It would actually be much more
disruptive than the proposed series. :)

One of the (many) problems with module loading is that it changes the
global features available to the kernel: suddenly all existing
processes have access to whatever feature just got loaded (modulo a
syscall path to touch the code). This is a rather ugly from the
perspective of containers: unprivileged module loading can change the
kernel globally for all containers, the load gets triggered to run in
the init userspace, not the container, etc, etc. (Though loading from
the container would be nonsense too.)

So, what we have now is that the permission verification already
happens at and around the existing request_module() calls. Callers are
already checking caps within containers, for example, or isolating
which things can be loaded with subsystem format prefixes, etc. All
the loading logic is being done around the request_module() calls, and
once this is deemed "safe", we explicitly elevate privileges to
init-ns CAP_SYS_MODULE and load the module by way of the usermode
helper's call to finit_module(). As in, we have an intentional
CAP_SYS_MODULE privilege elevation for doing module loading.

What I like about the proposed request_modue_cap() is that is codifies
the specific privilege change "having this cap allows you to gain
CAP_SYS_MODULE to load this possible prefixed subset of modules",
instead of having that logic open-coded near the request_module() call
site. And we can add more of these explicit kinds of calls, but it
won't change the need for a way to say "no privilege elevations should
be allowed".

It still sounds like you'd like to see an explicit change, similar to
the proposed request_module_cap(), that identifies the privilege
expectations on a per-call-site basis. How about this plan:

1) Add request_module_cap(required_cap, module_name_prefix, fmt, fmt_args...)

2) Convert known privileged-but-not-CAP_SYS_MODULE request_module()
callers to request_module_cap(the_needed_cap, prefix, ...)

2) Convert known unprivileged callers to use request_module_cap(0, ...)

3) Add WARN_RATELIMIT for request_module() calls without
CAP_SYS_MODULE to shake out other places where request_module_cap() is
needed.

4) Adapt the original patch series to add the per-process flag that
can block privilege elevations.

This will get us the improvement of removing any unexpectedly
unprivileged request_module() call sites for everyone, and still
provide the optional process flag to mark a process as not allowed to
make change its effective privilege to load a module.

Does this sound workable?

-Kees


[1] Just some greps to see the extent of request_module() use...

We've got a bit over 400 call sites:

$ git grep request_module'\b' | wc -l
438

About a third aren't using a format string, so they're loading either
a specific module or a specific alias:

$ git grep request_module'\b' | grep -v % | grep '"' | wc -l
160

Another third use format strings to load some aliased subset:

$ git grep request_module'\b' | grep % | wc -l
154

The rest construct their strings manually, making them less easy to
trivially examine.

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29 21:17                                     ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-29 21:17 UTC (permalink / raw)
  To: linux-security-module

On Wed, Nov 29, 2017 at 10:46 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Nov 29, 2017 at 10:30 AM, Kees Cook <keescook@chromium.org> wrote:
>> On Tue, Nov 28, 2017 at 4:50 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>>>
>>> Just thinking about the DCCP case, where networking people actually
>>> knew it was pretty deprecated and had no real maintainer, I think one
>>> thing to look at would be simply a per-module flag.
>>>   [ ... ]
>>> Does that sound reasonable?
>>
>> Yeah, I think I see the way forward here; thanks for the discussion!
>
> Note: I don't want to really force that per-module flag if it ends up
> being painful, I was really just thinking that considering the DCCP
> case, it would be (I think) a really nice solution.
>
> In particular, request_module() ends up having that indirection
> through usermode-helper, which makes it potentially very inconvenient
> to store the "did the original caller have proper capabilities" and
> then check it at actual module load time.

Yeah, for this it seems like the permission _evaluation_ would happen
at request_module() time, and it could pass that as a module argument
to the usermode-helper, so that during load_module(), it would have
all the needed information to take an action:

- was this load considered privileged?
- is this module marked privileged-only?
- what are we configured to do in the case of mismatches?

> So the module flag is technically easy to add, and it's technically
> easy to read at module loading time, but I suspect that it's actually
> annoyingly hard to pass the original request_module() capability
> information around to where we actually read the module.

The clever solution grsecurity uses for this is to pass information as
module loading arguments (mentioned above) in call_modprobe(), and
then read them back in load_module(). This seemed like overkill to me,
but it does allow for delaying the decision on the action.

I think this would only be needed to deal with per-module markings,
though, since all other cases the evaluation and action could happen
at request_module() time. (More on this below.)

> That's why it might be _much_ easier to try to do it per call-site
> instead. It's not quite as fine-grained (because several call-sites do
> things like
>
>         request_module("net-pf-%d-proto-%d", PF_NETLINK, protocol);
>
> that can load a large number of different modules), but if we can get
> away with just saying "this particular callsite is ok, because it only
> loads the bluetooth module that we thing is trustworthy" or "this
> call-site is ok, because you already had to have access to the device
> node", that is going to be much simpler and more straightforward.

This is basically what we've historically already tried to do: limit
request_module() calls to either one or a set of const strings, or to
load with a format template. Removing all the
request_module(whatever_the_user_says) calls has (hopefully) been
completed, though sometimes weird stuff sneaks by.

The difficulty I see here is that it's not always clear which are
expected to be privileged loads or not. Even an analysis[1] of whether
or not request_module() is using a format template doesn't tell us
much. Is it an unprivileged net module like your example above or is
it some privileged device hotplug event like
request_module("ulpi:v%04xp%04x", ulpi->id.vendor, ulpi->id.product)?

> In other words: I think there are at least two different models to go
> after, and there may be practical reasons to prefer one over the
> other.

I would agree, and it's why I thought the original series was good: I
feel like we've already done the harder work of narrowing
request_module() calls by subsystem prefixes over the years, and
separating privilege with a new API didn't make sense to me. The
privileges are a state to check in the existing API, etc. But as I
said, if there's a way forward you're happy with, let's do it. It just
needs to get to the point where an admin can explicitly disable
unprivileged module autoloading with some appropriate level of
granularity.

[email merging your later reply...]
> One possibly interesting approach would be to run the usermode helper
> not as root, but with the credentials of the request_module() caller.
>
> That's arguably the right thing to do (in that request_module() would
> never do anything that the user wouldn't be able to do on their own)
> and probably what we should have done originally, but while it feels
> like a nice solution I suspect it would break pretty much every distro
> out there.
>
> Because they all expect modprobe/kmod to be called as root in the
> original init-namespace.

Dropping to the request_module() privileges for the usermode helper
would absolutely break the world. It would actually be much more
disruptive than the proposed series. :)

One of the (many) problems with module loading is that it changes the
global features available to the kernel: suddenly all existing
processes have access to whatever feature just got loaded (modulo a
syscall path to touch the code). This is a rather ugly from the
perspective of containers: unprivileged module loading can change the
kernel globally for all containers, the load gets triggered to run in
the init userspace, not the container, etc, etc. (Though loading from
the container would be nonsense too.)

So, what we have now is that the permission verification already
happens at and around the existing request_module() calls. Callers are
already checking caps within containers, for example, or isolating
which things can be loaded with subsystem format prefixes, etc. All
the loading logic is being done around the request_module() calls, and
once this is deemed "safe", we explicitly elevate privileges to
init-ns CAP_SYS_MODULE and load the module by way of the usermode
helper's call to finit_module(). As in, we have an intentional
CAP_SYS_MODULE privilege elevation for doing module loading.

What I like about the proposed request_modue_cap() is that is codifies
the specific privilege change "having this cap allows you to gain
CAP_SYS_MODULE to load this possible prefixed subset of modules",
instead of having that logic open-coded near the request_module() call
site. And we can add more of these explicit kinds of calls, but it
won't change the need for a way to say "no privilege elevations should
be allowed".

It still sounds like you'd like to see an explicit change, similar to
the proposed request_module_cap(), that identifies the privilege
expectations on a per-call-site basis. How about this plan:

1) Add request_module_cap(required_cap, module_name_prefix, fmt, fmt_args...)

2) Convert known privileged-but-not-CAP_SYS_MODULE request_module()
callers to request_module_cap(the_needed_cap, prefix, ...)

2) Convert known unprivileged callers to use request_module_cap(0, ...)

3) Add WARN_RATELIMIT for request_module() calls without
CAP_SYS_MODULE to shake out other places where request_module_cap() is
needed.

4) Adapt the original patch series to add the per-process flag that
can block privilege elevations.

This will get us the improvement of removing any unexpectedly
unprivileged request_module() call sites for everyone, and still
provide the optional process flag to mark a process as not allowed to
make change its effective privilege to load a module.

Does this sound workable?

-Kees


[1] Just some greps to see the extent of request_module() use...

We've got a bit over 400 call sites:

$ git grep request_module'\b' | wc -l
438

About a third aren't using a format string, so they're loading either
a specific module or a specific alias:

$ git grep request_module'\b' | grep -v % | grep '"' | wc -l
160

Another third use format strings to load some aliased subset:

$ git grep request_module'\b' | grep % | wc -l
154

The rest construct their strings manually, making them less easy to
trivially examine.

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29 21:17                                     ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-29 21:17 UTC (permalink / raw)
  To: Linus Torvalds, Djalal Harouni, Jessica Yu
  Cc: LSM List, Linux Kernel Mailing List, kernel-hardening

On Wed, Nov 29, 2017 at 10:46 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Nov 29, 2017 at 10:30 AM, Kees Cook <keescook@chromium.org> wrote:
>> On Tue, Nov 28, 2017 at 4:50 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>>>
>>> Just thinking about the DCCP case, where networking people actually
>>> knew it was pretty deprecated and had no real maintainer, I think one
>>> thing to look at would be simply a per-module flag.
>>>   [ ... ]
>>> Does that sound reasonable?
>>
>> Yeah, I think I see the way forward here; thanks for the discussion!
>
> Note: I don't want to really force that per-module flag if it ends up
> being painful, I was really just thinking that considering the DCCP
> case, it would be (I think) a really nice solution.
>
> In particular, request_module() ends up having that indirection
> through usermode-helper, which makes it potentially very inconvenient
> to store the "did the original caller have proper capabilities" and
> then check it at actual module load time.

Yeah, for this it seems like the permission _evaluation_ would happen
at request_module() time, and it could pass that as a module argument
to the usermode-helper, so that during load_module(), it would have
all the needed information to take an action:

- was this load considered privileged?
- is this module marked privileged-only?
- what are we configured to do in the case of mismatches?

> So the module flag is technically easy to add, and it's technically
> easy to read at module loading time, but I suspect that it's actually
> annoyingly hard to pass the original request_module() capability
> information around to where we actually read the module.

The clever solution grsecurity uses for this is to pass information as
module loading arguments (mentioned above) in call_modprobe(), and
then read them back in load_module(). This seemed like overkill to me,
but it does allow for delaying the decision on the action.

I think this would only be needed to deal with per-module markings,
though, since all other cases the evaluation and action could happen
at request_module() time. (More on this below.)

> That's why it might be _much_ easier to try to do it per call-site
> instead. It's not quite as fine-grained (because several call-sites do
> things like
>
>         request_module("net-pf-%d-proto-%d", PF_NETLINK, protocol);
>
> that can load a large number of different modules), but if we can get
> away with just saying "this particular callsite is ok, because it only
> loads the bluetooth module that we thing is trustworthy" or "this
> call-site is ok, because you already had to have access to the device
> node", that is going to be much simpler and more straightforward.

This is basically what we've historically already tried to do: limit
request_module() calls to either one or a set of const strings, or to
load with a format template. Removing all the
request_module(whatever_the_user_says) calls has (hopefully) been
completed, though sometimes weird stuff sneaks by.

The difficulty I see here is that it's not always clear which are
expected to be privileged loads or not. Even an analysis[1] of whether
or not request_module() is using a format template doesn't tell us
much. Is it an unprivileged net module like your example above or is
it some privileged device hotplug event like
request_module("ulpi:v%04xp%04x", ulpi->id.vendor, ulpi->id.product)?

> In other words: I think there are at least two different models to go
> after, and there may be practical reasons to prefer one over the
> other.

I would agree, and it's why I thought the original series was good: I
feel like we've already done the harder work of narrowing
request_module() calls by subsystem prefixes over the years, and
separating privilege with a new API didn't make sense to me. The
privileges are a state to check in the existing API, etc. But as I
said, if there's a way forward you're happy with, let's do it. It just
needs to get to the point where an admin can explicitly disable
unprivileged module autoloading with some appropriate level of
granularity.

[email merging your later reply...]
> One possibly interesting approach would be to run the usermode helper
> not as root, but with the credentials of the request_module() caller.
>
> That's arguably the right thing to do (in that request_module() would
> never do anything that the user wouldn't be able to do on their own)
> and probably what we should have done originally, but while it feels
> like a nice solution I suspect it would break pretty much every distro
> out there.
>
> Because they all expect modprobe/kmod to be called as root in the
> original init-namespace.

Dropping to the request_module() privileges for the usermode helper
would absolutely break the world. It would actually be much more
disruptive than the proposed series. :)

One of the (many) problems with module loading is that it changes the
global features available to the kernel: suddenly all existing
processes have access to whatever feature just got loaded (modulo a
syscall path to touch the code). This is a rather ugly from the
perspective of containers: unprivileged module loading can change the
kernel globally for all containers, the load gets triggered to run in
the init userspace, not the container, etc, etc. (Though loading from
the container would be nonsense too.)

So, what we have now is that the permission verification already
happens at and around the existing request_module() calls. Callers are
already checking caps within containers, for example, or isolating
which things can be loaded with subsystem format prefixes, etc. All
the loading logic is being done around the request_module() calls, and
once this is deemed "safe", we explicitly elevate privileges to
init-ns CAP_SYS_MODULE and load the module by way of the usermode
helper's call to finit_module(). As in, we have an intentional
CAP_SYS_MODULE privilege elevation for doing module loading.

What I like about the proposed request_modue_cap() is that is codifies
the specific privilege change "having this cap allows you to gain
CAP_SYS_MODULE to load this possible prefixed subset of modules",
instead of having that logic open-coded near the request_module() call
site. And we can add more of these explicit kinds of calls, but it
won't change the need for a way to say "no privilege elevations should
be allowed".

It still sounds like you'd like to see an explicit change, similar to
the proposed request_module_cap(), that identifies the privilege
expectations on a per-call-site basis. How about this plan:

1) Add request_module_cap(required_cap, module_name_prefix, fmt, fmt_args...)

2) Convert known privileged-but-not-CAP_SYS_MODULE request_module()
callers to request_module_cap(the_needed_cap, prefix, ...)

2) Convert known unprivileged callers to use request_module_cap(0, ...)

3) Add WARN_RATELIMIT for request_module() calls without
CAP_SYS_MODULE to shake out other places where request_module_cap() is
needed.

4) Adapt the original patch series to add the per-process flag that
can block privilege elevations.

This will get us the improvement of removing any unexpectedly
unprivileged request_module() call sites for everyone, and still
provide the optional process flag to mark a process as not allowed to
make change its effective privilege to load a module.

Does this sound workable?

-Kees


[1] Just some greps to see the extent of request_module() use...

We've got a bit over 400 call sites:

$ git grep request_module'\b' | wc -l
438

About a third aren't using a format string, so they're loading either
a specific module or a specific alias:

$ git grep request_module'\b' | grep -v % | grep '"' | wc -l
160

Another third use format strings to load some aliased subset:

$ git grep request_module'\b' | grep % | wc -l
154

The rest construct their strings manually, making them less easy to
trivially examine.

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-29 21:17                                     ` Kees Cook
  (?)
@ 2017-11-29 22:14                                       ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29 22:14 UTC (permalink / raw)
  To: Kees Cook
  Cc: Djalal Harouni, Jessica Yu, LSM List, Linux Kernel Mailing List,
	kernel-hardening

On Wed, Nov 29, 2017 at 1:17 PM, Kees Cook <keescook@chromium.org> wrote:
>
> So, what we have now is that the permission verification already
> happens at and around the existing request_module() calls.

Usually, yes.

I liked the "request_module_cap()" interface partly because that made
the net/core/dev_ioctl.c ones more explicit, and maybe it could be
convenient if we make other places do similar things.

I was hoping some other users could be converted, but grepping around,
there's no obvious cases. There is tcp_cong.c and tcp_ulp.c, but they
want some extra locking in between the checking..

> It still sounds like you'd like to see an explicit change, similar to
> the proposed request_module_cap(), that identifies the privilege
> expectations on a per-call-site basis. How about this plan:

Yes.

I'd be perfectly happy to have a long-range plan where the existing
"request_module()" ends up requiring more capabilities.

I just don't think it's a good first step, exactly because *if* it's a
first step, it basically has to be disabled by default.

And once you disable it by default, and it becomes purely opt-in, that
means that nothing will change for most cases. Some embedded people
that do their own thing (ie Android) might change, but normal
distributions probably won't.

Yes, Android may be 99% of the users, and yes, the embedded world in
general needs to be secure, but I'd still like this to be something
that helps _everybody_.

So:

> 1) Add request_module_cap(required_cap, module_name_prefix, fmt, fmt_args...)
>
> 2) Convert known privileged-but-not-CAP_SYS_MODULE request_module()
> callers to request_module_cap(the_needed_cap, prefix, ...)

Yes. The upside seems to be very limited here, but at least it makes
the users that use CAP_NET_ADMIN instead of CAP_SYS_MODULE able to
specify so.

> 2) Convert known unprivileged callers to use request_module_cap(0, ...)

0 is CAP_CHOWN, so it would have to be -1.

And I wouldn't actually want to see that as-is. Not only would I not
want to see people have that "-1" in random driver subsystems, I'd
much prefer to have actual helper naming that descibes why something
is ok

Because as mentioned, I think there are valid permission reasons that
are _not_ about capabilities that make you able to load a module.

If you can open a character device node, then "misc_open()" will do

                request_module("char-major-%d-%d", MISC_MAJOR, minor);

and there is nothing about capabilities in the CAP_SYS_MODULE sense
about the user. But the user _did_ have the privileges to open that
character device file.

That's why I suggested something like request_module_dev(): it's not
at all the same thing as request_module_cap(-1, ...), saying "I don't
need/have a capability". It's saying something else entirely, it's
basically saying "I have the right based on device permissions".

And something like request_module_dev() might even have real semantic
meaning, exactly because it says "this module request comes from
trying to open a device node".

Why would that be? If we know we're on a system where /dev is
auto-populated through udev, then the device nodes should have been
created by the drivers, not the other way around. So we might even
have a rule that notices that automatically, and simply disables
request_module_dev() entirely.

Anyway, I'm not saying that is necessarily something we should do, but
I do suspect that we could adapt to modern systems without having to
have tons of magic settings, and try to be as strict as possible
without breaking them.

Because I dislike "system tuning" in general. I hate knobs that do
kernel performance tuning - we try very hard to just DTRT wrt sizing
hashes etc instead of expecting the system admin to set flags.

And I think we can try to avoid some system tuning in this area too.

I suspect that for a lot of our existing request_module() cases, they
really are pretty trivial. In most cases, it's probably really about
whether you have the hardware or not.

So for the hardware driver cases, either the hardware enumerates
itself, or it is presumably set up by the system scripts anyway, and
CAP_SYS_MODULE is all fine. The "open device node" case is one special
case, though.

That mainly leaves the protocol ones we need to look out for, I suspect.

> 3) Add WARN_RATELIMIT for request_module() calls without
> CAP_SYS_MODULE to shake out other places where request_module_cap() is
> needed.

Yes.

And this is where I hope that there really aren't actually all that
many cases that will warn, and that it's hopefully easy to simply just
look at a handful of reports and say "ok, that case is obviously
fine".

And I may be wrong.

> 4) Adapt the original patch series to add the per-process flag that
> can block privilege elevations.

Yes.

            Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29 22:14                                       ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29 22:14 UTC (permalink / raw)
  To: linux-security-module

On Wed, Nov 29, 2017 at 1:17 PM, Kees Cook <keescook@chromium.org> wrote:
>
> So, what we have now is that the permission verification already
> happens at and around the existing request_module() calls.

Usually, yes.

I liked the "request_module_cap()" interface partly because that made
the net/core/dev_ioctl.c ones more explicit, and maybe it could be
convenient if we make other places do similar things.

I was hoping some other users could be converted, but grepping around,
there's no obvious cases. There is tcp_cong.c and tcp_ulp.c, but they
want some extra locking in between the checking..

> It still sounds like you'd like to see an explicit change, similar to
> the proposed request_module_cap(), that identifies the privilege
> expectations on a per-call-site basis. How about this plan:

Yes.

I'd be perfectly happy to have a long-range plan where the existing
"request_module()" ends up requiring more capabilities.

I just don't think it's a good first step, exactly because *if* it's a
first step, it basically has to be disabled by default.

And once you disable it by default, and it becomes purely opt-in, that
means that nothing will change for most cases. Some embedded people
that do their own thing (ie Android) might change, but normal
distributions probably won't.

Yes, Android may be 99% of the users, and yes, the embedded world in
general needs to be secure, but I'd still like this to be something
that helps _everybody_.

So:

> 1) Add request_module_cap(required_cap, module_name_prefix, fmt, fmt_args...)
>
> 2) Convert known privileged-but-not-CAP_SYS_MODULE request_module()
> callers to request_module_cap(the_needed_cap, prefix, ...)

Yes. The upside seems to be very limited here, but at least it makes
the users that use CAP_NET_ADMIN instead of CAP_SYS_MODULE able to
specify so.

> 2) Convert known unprivileged callers to use request_module_cap(0, ...)

0 is CAP_CHOWN, so it would have to be -1.

And I wouldn't actually want to see that as-is. Not only would I not
want to see people have that "-1" in random driver subsystems, I'd
much prefer to have actual helper naming that descibes why something
is ok

Because as mentioned, I think there are valid permission reasons that
are _not_ about capabilities that make you able to load a module.

If you can open a character device node, then "misc_open()" will do

                request_module("char-major-%d-%d", MISC_MAJOR, minor);

and there is nothing about capabilities in the CAP_SYS_MODULE sense
about the user. But the user _did_ have the privileges to open that
character device file.

That's why I suggested something like request_module_dev(): it's not
at all the same thing as request_module_cap(-1, ...), saying "I don't
need/have a capability". It's saying something else entirely, it's
basically saying "I have the right based on device permissions".

And something like request_module_dev() might even have real semantic
meaning, exactly because it says "this module request comes from
trying to open a device node".

Why would that be? If we know we're on a system where /dev is
auto-populated through udev, then the device nodes should have been
created by the drivers, not the other way around. So we might even
have a rule that notices that automatically, and simply disables
request_module_dev() entirely.

Anyway, I'm not saying that is necessarily something we should do, but
I do suspect that we could adapt to modern systems without having to
have tons of magic settings, and try to be as strict as possible
without breaking them.

Because I dislike "system tuning" in general. I hate knobs that do
kernel performance tuning - we try very hard to just DTRT wrt sizing
hashes etc instead of expecting the system admin to set flags.

And I think we can try to avoid some system tuning in this area too.

I suspect that for a lot of our existing request_module() cases, they
really are pretty trivial. In most cases, it's probably really about
whether you have the hardware or not.

So for the hardware driver cases, either the hardware enumerates
itself, or it is presumably set up by the system scripts anyway, and
CAP_SYS_MODULE is all fine. The "open device node" case is one special
case, though.

That mainly leaves the protocol ones we need to look out for, I suspect.

> 3) Add WARN_RATELIMIT for request_module() calls without
> CAP_SYS_MODULE to shake out other places where request_module_cap() is
> needed.

Yes.

And this is where I hope that there really aren't actually all that
many cases that will warn, and that it's hopefully easy to simply just
look at a handful of reports and say "ok, that case is obviously
fine".

And I may be wrong.

> 4) Adapt the original patch series to add the per-process flag that
> can block privilege elevations.

Yes.

            Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-29 22:14                                       ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29 22:14 UTC (permalink / raw)
  To: Kees Cook
  Cc: Djalal Harouni, Jessica Yu, LSM List, Linux Kernel Mailing List,
	kernel-hardening

On Wed, Nov 29, 2017 at 1:17 PM, Kees Cook <keescook@chromium.org> wrote:
>
> So, what we have now is that the permission verification already
> happens at and around the existing request_module() calls.

Usually, yes.

I liked the "request_module_cap()" interface partly because that made
the net/core/dev_ioctl.c ones more explicit, and maybe it could be
convenient if we make other places do similar things.

I was hoping some other users could be converted, but grepping around,
there's no obvious cases. There is tcp_cong.c and tcp_ulp.c, but they
want some extra locking in between the checking..

> It still sounds like you'd like to see an explicit change, similar to
> the proposed request_module_cap(), that identifies the privilege
> expectations on a per-call-site basis. How about this plan:

Yes.

I'd be perfectly happy to have a long-range plan where the existing
"request_module()" ends up requiring more capabilities.

I just don't think it's a good first step, exactly because *if* it's a
first step, it basically has to be disabled by default.

And once you disable it by default, and it becomes purely opt-in, that
means that nothing will change for most cases. Some embedded people
that do their own thing (ie Android) might change, but normal
distributions probably won't.

Yes, Android may be 99% of the users, and yes, the embedded world in
general needs to be secure, but I'd still like this to be something
that helps _everybody_.

So:

> 1) Add request_module_cap(required_cap, module_name_prefix, fmt, fmt_args...)
>
> 2) Convert known privileged-but-not-CAP_SYS_MODULE request_module()
> callers to request_module_cap(the_needed_cap, prefix, ...)

Yes. The upside seems to be very limited here, but at least it makes
the users that use CAP_NET_ADMIN instead of CAP_SYS_MODULE able to
specify so.

> 2) Convert known unprivileged callers to use request_module_cap(0, ...)

0 is CAP_CHOWN, so it would have to be -1.

And I wouldn't actually want to see that as-is. Not only would I not
want to see people have that "-1" in random driver subsystems, I'd
much prefer to have actual helper naming that descibes why something
is ok

Because as mentioned, I think there are valid permission reasons that
are _not_ about capabilities that make you able to load a module.

If you can open a character device node, then "misc_open()" will do

                request_module("char-major-%d-%d", MISC_MAJOR, minor);

and there is nothing about capabilities in the CAP_SYS_MODULE sense
about the user. But the user _did_ have the privileges to open that
character device file.

That's why I suggested something like request_module_dev(): it's not
at all the same thing as request_module_cap(-1, ...), saying "I don't
need/have a capability". It's saying something else entirely, it's
basically saying "I have the right based on device permissions".

And something like request_module_dev() might even have real semantic
meaning, exactly because it says "this module request comes from
trying to open a device node".

Why would that be? If we know we're on a system where /dev is
auto-populated through udev, then the device nodes should have been
created by the drivers, not the other way around. So we might even
have a rule that notices that automatically, and simply disables
request_module_dev() entirely.

Anyway, I'm not saying that is necessarily something we should do, but
I do suspect that we could adapt to modern systems without having to
have tons of magic settings, and try to be as strict as possible
without breaking them.

Because I dislike "system tuning" in general. I hate knobs that do
kernel performance tuning - we try very hard to just DTRT wrt sizing
hashes etc instead of expecting the system admin to set flags.

And I think we can try to avoid some system tuning in this area too.

I suspect that for a lot of our existing request_module() cases, they
really are pretty trivial. In most cases, it's probably really about
whether you have the hardware or not.

So for the hardware driver cases, either the hardware enumerates
itself, or it is presumably set up by the system scripts anyway, and
CAP_SYS_MODULE is all fine. The "open device node" case is one special
case, though.

That mainly leaves the protocol ones we need to look out for, I suspect.

> 3) Add WARN_RATELIMIT for request_module() calls without
> CAP_SYS_MODULE to shake out other places where request_module_cap() is
> needed.

Yes.

And this is where I hope that there really aren't actually all that
many cases that will warn, and that it's hopefully easy to simply just
look at a handful of reports and say "ok, that case is obviously
fine".

And I may be wrong.

> 4) Adapt the original patch series to add the per-process flag that
> can block privilege elevations.

Yes.

            Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-29 15:58                   ` David Miller
  (?)
  (?)
@ 2017-11-29 22:45                     ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29 22:45 UTC (permalink / raw)
  To: David Miller
  Cc: Theodore Ts'o, One Thousand Gnomes, Kees Cook,
	Luis R. Rodriguez, Djalal Harouni, Andrew Lutomirski,
	Andrew Morton, James Morris, Ben Hutchings, Solar Designer,
	Serge E. Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, Network Development,
	Peter Zijlstra

On Wed, Nov 29, 2017 at 7:58 AM, David Miller <davem@davemloft.net> wrote:
>
> We're talking about making sure that loading "ppp.ko" really gets
> ppp.ko rather than some_other_module.ko renamed to ppp.ko via some
> other mechanism.
>
> Both modules have legitimate signatures so the kernel will happily
> load both.

Yes. We could make the module name be part of the signing process, but
one problem with that is that at module loading time we don't actually
have the filename any more.

User space opens the file and then just feeds the data to the kernel.
So if you fooled modprobe into feeding the wrong module, that's it.

And yes, we can obviously embed the module name into the ELF headers
(that is all part of the signed payload), but the module name doesn't
actually necessarily match what we originally asked for.

Why? Module aliases and module dependencies - which is why we have
that user mode side at all. When we do "request_module(XYZ)" we don't
necessarily know what the dependencies are, so we expect modprobe to
just load the right modules.

So if modprobe then loads some other module (dccp or whatever), the
kernel has no real way to know "oh, that wasn't part of the dependency
chain for the module we aked for".

Now, if modprobe is taught to check that the filename of the module
that it opens actually matches the metadata in the ELF sections, that
would solve it, but it's out of the kernels hands..

             Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29 22:45                     ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29 22:45 UTC (permalink / raw)
  To: David Miller
  Cc: Theodore Ts'o, One Thousand Gnomes, Kees Cook,
	Luis R. Rodriguez, Djalal Harouni, Andrew Lutomirski,
	Andrew Morton, James Morris, Ben Hutchings, Solar Designer,
	Serge E. Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, Network Development,
	Peter Zijlstra

On Wed, Nov 29, 2017 at 7:58 AM, David Miller <davem@davemloft.net> wrote:
>
> We're talking about making sure that loading "ppp.ko" really gets
> ppp.ko rather than some_other_module.ko renamed to ppp.ko via some
> other mechanism.
>
> Both modules have legitimate signatures so the kernel will happily
> load both.

Yes. We could make the module name be part of the signing process, but
one problem with that is that at module loading time we don't actually
have the filename any more.

User space opens the file and then just feeds the data to the kernel.
So if you fooled modprobe into feeding the wrong module, that's it.

And yes, we can obviously embed the module name into the ELF headers
(that is all part of the signed payload), but the module name doesn't
actually necessarily match what we originally asked for.

Why? Module aliases and module dependencies - which is why we have
that user mode side at all. When we do "request_module(XYZ)" we don't
necessarily know what the dependencies are, so we expect modprobe to
just load the right modules.

So if modprobe then loads some other module (dccp or whatever), the
kernel has no real way to know "oh, that wasn't part of the dependency
chain for the module we aked for".

Now, if modprobe is taught to check that the filename of the module
that it opens actually matches the metadata in the ELF sections, that
would solve it, but it's out of the kernels hands..

             Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29 22:45                     ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29 22:45 UTC (permalink / raw)
  To: linux-security-module

On Wed, Nov 29, 2017 at 7:58 AM, David Miller <davem@davemloft.net> wrote:
>
> We're talking about making sure that loading "ppp.ko" really gets
> ppp.ko rather than some_other_module.ko renamed to ppp.ko via some
> other mechanism.
>
> Both modules have legitimate signatures so the kernel will happily
> load both.

Yes. We could make the module name be part of the signing process, but
one problem with that is that at module loading time we don't actually
have the filename any more.

User space opens the file and then just feeds the data to the kernel.
So if you fooled modprobe into feeding the wrong module, that's it.

And yes, we can obviously embed the module name into the ELF headers
(that is all part of the signed payload), but the module name doesn't
actually necessarily match what we originally asked for.

Why? Module aliases and module dependencies - which is why we have
that user mode side at all. When we do "request_module(XYZ)" we don't
necessarily know what the dependencies are, so we expect modprobe to
just load the right modules.

So if modprobe then loads some other module (dccp or whatever), the
kernel has no real way to know "oh, that wasn't part of the dependency
chain for the module we aked for".

Now, if modprobe is taught to check that the filename of the module
that it opens actually matches the metadata in the ELF sections, that
would solve it, but it's out of the kernels hands..

             Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-29 22:45                     ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-29 22:45 UTC (permalink / raw)
  To: David Miller
  Cc: Theodore Ts'o, One Thousand Gnomes, Kees Cook,
	Luis R. Rodriguez, Djalal Harouni, Andrew Lutomirski,
	Andrew Morton, James Morris, Ben Hutchings, Solar Designer,
	Serge E. Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, Network Development,
	Peter Zijlstra

On Wed, Nov 29, 2017 at 7:58 AM, David Miller <davem@davemloft.net> wrote:
>
> We're talking about making sure that loading "ppp.ko" really gets
> ppp.ko rather than some_other_module.ko renamed to ppp.ko via some
> other mechanism.
>
> Both modules have legitimate signatures so the kernel will happily
> load both.

Yes. We could make the module name be part of the signing process, but
one problem with that is that at module loading time we don't actually
have the filename any more.

User space opens the file and then just feeds the data to the kernel.
So if you fooled modprobe into feeding the wrong module, that's it.

And yes, we can obviously embed the module name into the ELF headers
(that is all part of the signed payload), but the module name doesn't
actually necessarily match what we originally asked for.

Why? Module aliases and module dependencies - which is why we have
that user mode side at all. When we do "request_module(XYZ)" we don't
necessarily know what the dependencies are, so we expect modprobe to
just load the right modules.

So if modprobe then loads some other module (dccp or whatever), the
kernel has no real way to know "oh, that wasn't part of the dependency
chain for the module we aked for".

Now, if modprobe is taught to check that the filename of the module
that it opens actually matches the metadata in the ELF sections, that
would solve it, but it's out of the kernels hands..

             Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-29 22:45                     ` Linus Torvalds
  (?)
  (?)
@ 2017-11-30  0:06                       ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-30  0:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, Theodore Ts'o, One Thousand Gnomes,
	Luis R. Rodriguez, Djalal Harouni, Andrew Lutomirski,
	Andrew Morton, James Morris, Ben Hutchings, Solar Designer,
	Serge E. Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, Network Development,
	Peter Zijlstra

On Wed, Nov 29, 2017 at 2:45 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Nov 29, 2017 at 7:58 AM, David Miller <davem@davemloft.net> wrote:
>>
>> We're talking about making sure that loading "ppp.ko" really gets
>> ppp.ko rather than some_other_module.ko renamed to ppp.ko via some
>> other mechanism.
>>
>> Both modules have legitimate signatures so the kernel will happily
>> load both.
>
> Yes. We could make the module name be part of the signing process, but
> one problem with that is that at module loading time we don't actually
> have the filename any more.

FWIW, I added this (well, KBUILD_MODNAME) to the module info just recently:

3e2e857f9c3a ("module: Add module name to modinfo")

> User space opens the file and then just feeds the data to the kernel.
> So if you fooled modprobe into feeding the wrong module, that's it.
>
> And yes, we can obviously embed the module name into the ELF headers
> (that is all part of the signed payload), but the module name doesn't
> actually necessarily match what we originally asked for.
>
> Why? Module aliases and module dependencies - which is why we have
> that user mode side at all. When we do "request_module(XYZ)" we don't
> necessarily know what the dependencies are, so we expect modprobe to
> just load the right modules.
>
> So if modprobe then loads some other module (dccp or whatever), the
> kernel has no real way to know "oh, that wasn't part of the dependency
> chain for the module we aked for".
>
> Now, if modprobe is taught to check that the filename of the module
> that it opens actually matches the metadata in the ELF sections, that
> would solve it, but it's out of the kernels hands..

Right, the aliases are why these kinds of renaming shenanigans don't
mean anything: it's not distinguishable from whatever modprobe.conf
ultimately tells modprobe to do.

If you can't trust your filesystem to hold your kernel modules
correctly, you have much bigger problems. (And yes, capabilities are a
problem here, since there are many paths to full root from individual
capabilities, but that's a known issue that is much larger than
tricking modprobe.)

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-30  0:06                       ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-30  0:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, Theodore Ts'o, One Thousand Gnomes,
	Luis R. Rodriguez, Djalal Harouni, Andrew Lutomirski,
	Andrew Morton, James Morris, Ben Hutchings, Solar Designer,
	Serge E. Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, Network Development,
	Peter Zijlstra

On Wed, Nov 29, 2017 at 2:45 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Nov 29, 2017 at 7:58 AM, David Miller <davem@davemloft.net> wrote:
>>
>> We're talking about making sure that loading "ppp.ko" really gets
>> ppp.ko rather than some_other_module.ko renamed to ppp.ko via some
>> other mechanism.
>>
>> Both modules have legitimate signatures so the kernel will happily
>> load both.
>
> Yes. We could make the module name be part of the signing process, but
> one problem with that is that at module loading time we don't actually
> have the filename any more.

FWIW, I added this (well, KBUILD_MODNAME) to the module info just recently:

3e2e857f9c3a ("module: Add module name to modinfo")

> User space opens the file and then just feeds the data to the kernel.
> So if you fooled modprobe into feeding the wrong module, that's it.
>
> And yes, we can obviously embed the module name into the ELF headers
> (that is all part of the signed payload), but the module name doesn't
> actually necessarily match what we originally asked for.
>
> Why? Module aliases and module dependencies - which is why we have
> that user mode side at all. When we do "request_module(XYZ)" we don't
> necessarily know what the dependencies are, so we expect modprobe to
> just load the right modules.
>
> So if modprobe then loads some other module (dccp or whatever), the
> kernel has no real way to know "oh, that wasn't part of the dependency
> chain for the module we aked for".
>
> Now, if modprobe is taught to check that the filename of the module
> that it opens actually matches the metadata in the ELF sections, that
> would solve it, but it's out of the kernels hands..

Right, the aliases are why these kinds of renaming shenanigans don't
mean anything: it's not distinguishable from whatever modprobe.conf
ultimately tells modprobe to do.

If you can't trust your filesystem to hold your kernel modules
correctly, you have much bigger problems. (And yes, capabilities are a
problem here, since there are many paths to full root from individual
capabilities, but that's a known issue that is much larger than
tricking modprobe.)

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-30  0:06                       ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-30  0:06 UTC (permalink / raw)
  To: linux-security-module

On Wed, Nov 29, 2017 at 2:45 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Nov 29, 2017 at 7:58 AM, David Miller <davem@davemloft.net> wrote:
>>
>> We're talking about making sure that loading "ppp.ko" really gets
>> ppp.ko rather than some_other_module.ko renamed to ppp.ko via some
>> other mechanism.
>>
>> Both modules have legitimate signatures so the kernel will happily
>> load both.
>
> Yes. We could make the module name be part of the signing process, but
> one problem with that is that at module loading time we don't actually
> have the filename any more.

FWIW, I added this (well, KBUILD_MODNAME) to the module info just recently:

3e2e857f9c3a ("module: Add module name to modinfo")

> User space opens the file and then just feeds the data to the kernel.
> So if you fooled modprobe into feeding the wrong module, that's it.
>
> And yes, we can obviously embed the module name into the ELF headers
> (that is all part of the signed payload), but the module name doesn't
> actually necessarily match what we originally asked for.
>
> Why? Module aliases and module dependencies - which is why we have
> that user mode side at all. When we do "request_module(XYZ)" we don't
> necessarily know what the dependencies are, so we expect modprobe to
> just load the right modules.
>
> So if modprobe then loads some other module (dccp or whatever), the
> kernel has no real way to know "oh, that wasn't part of the dependency
> chain for the module we aked for".
>
> Now, if modprobe is taught to check that the filename of the module
> that it opens actually matches the metadata in the ELF sections, that
> would solve it, but it's out of the kernels hands..

Right, the aliases are why these kinds of renaming shenanigans don't
mean anything: it's not distinguishable from whatever modprobe.conf
ultimately tells modprobe to do.

If you can't trust your filesystem to hold your kernel modules
correctly, you have much bigger problems. (And yes, capabilities are a
problem here, since there are many paths to full root from individual
capabilities, but that's a known issue that is much larger than
tricking modprobe.)

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-30  0:06                       ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-30  0:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, Theodore Ts'o, One Thousand Gnomes,
	Luis R. Rodriguez, Djalal Harouni, Andrew Lutomirski,
	Andrew Morton, James Morris, Ben Hutchings, Solar Designer,
	Serge E. Hallyn, Jessica Yu, Rusty Russell,
	Linux Kernel Mailing List, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, Network Development,
	Peter Zijlstra

On Wed, Nov 29, 2017 at 2:45 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Nov 29, 2017 at 7:58 AM, David Miller <davem@davemloft.net> wrote:
>>
>> We're talking about making sure that loading "ppp.ko" really gets
>> ppp.ko rather than some_other_module.ko renamed to ppp.ko via some
>> other mechanism.
>>
>> Both modules have legitimate signatures so the kernel will happily
>> load both.
>
> Yes. We could make the module name be part of the signing process, but
> one problem with that is that at module loading time we don't actually
> have the filename any more.

FWIW, I added this (well, KBUILD_MODNAME) to the module info just recently:

3e2e857f9c3a ("module: Add module name to modinfo")

> User space opens the file and then just feeds the data to the kernel.
> So if you fooled modprobe into feeding the wrong module, that's it.
>
> And yes, we can obviously embed the module name into the ELF headers
> (that is all part of the signed payload), but the module name doesn't
> actually necessarily match what we originally asked for.
>
> Why? Module aliases and module dependencies - which is why we have
> that user mode side at all. When we do "request_module(XYZ)" we don't
> necessarily know what the dependencies are, so we expect modprobe to
> just load the right modules.
>
> So if modprobe then loads some other module (dccp or whatever), the
> kernel has no real way to know "oh, that wasn't part of the dependency
> chain for the module we aked for".
>
> Now, if modprobe is taught to check that the filename of the module
> that it opens actually matches the metadata in the ELF sections, that
> would solve it, but it's out of the kernels hands..

Right, the aliases are why these kinds of renaming shenanigans don't
mean anything: it's not distinguishable from whatever modprobe.conf
ultimately tells modprobe to do.

If you can't trust your filesystem to hold your kernel modules
correctly, you have much bigger problems. (And yes, capabilities are a
problem here, since there are many paths to full root from individual
capabilities, but that's a known issue that is much larger than
tricking modprobe.)

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-29 17:28                   ` Serge E. Hallyn
  (?)
@ 2017-11-30  0:35                     ` Theodore Ts'o
  -1 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-30  0:35 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: David Miller, gnomes, keescook, mcgrof, tixxdz, luto, akpm,
	james.l.morris, ben.hutchings, solar, jeyu, rusty, linux-kernel,
	linux-security-module, kernel-hardening, corbet, mingo, netdev,
	peterz, torvalds

On Wed, Nov 29, 2017 at 11:28:52AM -0600, Serge E. Hallyn wrote:
> 
> Just to be clear, module loading requires - and must always continue to
> require - CAP_SYS_MODULE against the initial user namespace.  Containers
> in user namespaces do not have that.
> 
> I don't believe anyone has ever claimed that containers which are not in
> a user namespace are in any way secure.

Unless the container performs some action which causes the kernel to
call request_module(), which then loads some kernel module,
potentially containing cr*p unmaintained code which was included when
the distro compiled the world, into the host kernel.

This is an attack vector that doesn't exist if you are using VM's.
And in general, the attack surface of the entire Linux
kernel<->userspace API is far larger than that which is exposed by the
guest<->host interface.

For that reason, containers are *far* more insecure than VM's, since
once the attacker gets root on the guest VM, they then have to attack
the hypervisor interface.  And if you compare the attack surface of
the two, it's pretty clear which is larger, and it's not the
hypervisor interface.

						- Ted

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-30  0:35                     ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-30  0:35 UTC (permalink / raw)
  To: linux-security-module

On Wed, Nov 29, 2017 at 11:28:52AM -0600, Serge E. Hallyn wrote:
> 
> Just to be clear, module loading requires - and must always continue to
> require - CAP_SYS_MODULE against the initial user namespace.  Containers
> in user namespaces do not have that.
> 
> I don't believe anyone has ever claimed that containers which are not in
> a user namespace are in any way secure.

Unless the container performs some action which causes the kernel to
call request_module(), which then loads some kernel module,
potentially containing cr*p unmaintained code which was included when
the distro compiled the world, into the host kernel.

This is an attack vector that doesn't exist if you are using VM's.
And in general, the attack surface of the entire Linux
kernel<->userspace API is far larger than that which is exposed by the
guest<->host interface.

For that reason, containers are *far* more insecure than VM's, since
once the attacker gets root on the guest VM, they then have to attack
the hypervisor interface.  And if you compare the attack surface of
the two, it's pretty clear which is larger, and it's not the
hypervisor interface.

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-30  0:35                     ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-30  0:35 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: David Miller, gnomes, keescook, mcgrof, tixxdz, luto, akpm,
	james.l.morris, ben.hutchings, solar, jeyu, rusty, linux-kernel,
	linux-security-module, kernel-hardening, corbet, mingo, netdev,
	peterz, torvalds

On Wed, Nov 29, 2017 at 11:28:52AM -0600, Serge E. Hallyn wrote:
> 
> Just to be clear, module loading requires - and must always continue to
> require - CAP_SYS_MODULE against the initial user namespace.  Containers
> in user namespaces do not have that.
> 
> I don't believe anyone has ever claimed that containers which are not in
> a user namespace are in any way secure.

Unless the container performs some action which causes the kernel to
call request_module(), which then loads some kernel module,
potentially containing cr*p unmaintained code which was included when
the distro compiled the world, into the host kernel.

This is an attack vector that doesn't exist if you are using VM's.
And in general, the attack surface of the entire Linux
kernel<->userspace API is far larger than that which is exposed by the
guest<->host interface.

For that reason, containers are *far* more insecure than VM's, since
once the attacker gets root on the guest VM, they then have to attack
the hypervisor interface.  And if you compare the attack surface of
the two, it's pretty clear which is larger, and it's not the
hypervisor interface.

						- Ted

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-29 22:14                                       ` Linus Torvalds
  (?)
@ 2017-11-30  0:44                                         ` Kees Cook
  -1 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-30  0:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Djalal Harouni, Jessica Yu, LSM List, Linux Kernel Mailing List,
	kernel-hardening

On Wed, Nov 29, 2017 at 2:14 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Nov 29, 2017 at 1:17 PM, Kees Cook <keescook@chromium.org> wrote:
>> 1) Add request_module_cap(required_cap, module_name_prefix, fmt, fmt_args...)
>>
>> 2) Convert known privileged-but-not-CAP_SYS_MODULE request_module()
>> callers to request_module_cap(the_needed_cap, prefix, ...)
>
> Yes. The upside seems to be very limited here, but at least it makes
> the users that use CAP_NET_ADMIN instead of CAP_SYS_MODULE able to
> specify so.

Yeah, agreed; it's limited in scope.

>> 2) Convert known unprivileged callers to use request_module_cap(0, ...)
>
> 0 is CAP_CHOWN, so it would have to be -1.

Oops, yes, think-o.

> And I wouldn't actually want to see that as-is. Not only would I not
> want to see people have that "-1" in random driver subsystems, I'd
> much prefer to have actual helper naming that descibes why something
> is ok

So, if some catch-all is needed, maybe request_module_unpriv(...), but
I think it should be possible to identify SOME kind of defining
privilege to check.

> Because as mentioned, I think there are valid permission reasons that
> are _not_ about capabilities that make you able to load a module.
>
> If you can open a character device node, then "misc_open()" will do
>
>                 request_module("char-major-%d-%d", MISC_MAJOR, minor);
>
> and there is nothing about capabilities in the CAP_SYS_MODULE sense
> about the user. But the user _did_ have the privileges to open that
> character device file.
>
> That's why I suggested something like request_module_dev(): it's not
> at all the same thing as request_module_cap(-1, ...), saying "I don't
> need/have a capability". It's saying something else entirely, it's
> basically saying "I have the right based on device permissions".
>
> And something like request_module_dev() might even have real semantic
> meaning, exactly because it says "this module request comes from
> trying to open a device node".

I just want to make sure I'm visualizing this correctly. Using the
misc_open() example, you're thinking something like:

    request_module_dev(file, "char-major-%d-%d", MISC_MAJOR, minor);

which would then do a verification that file->f_cred matched
current_cred()? (For misc_open(), this is obviously going to always be
okay, but other cases may be further from the fops open handler...)

> Why would that be? If we know we're on a system where /dev is
> auto-populated through udev, then the device nodes should have been
> created by the drivers, not the other way around. So we might even
> have a rule that notices that automatically, and simply disables
> request_module_dev() entirely.

Right, we have both directions -- hotplug module loading ("I saw this
PCI alias, load something! Oh! It needs some fancy crypto too!") and
on-use module loading. It seems like on-use module loading would get
covered by request_module_dev(), and the hotplug case would always be
privileged. Anyway... exploring this with real code is clearly
warranted.

> I suspect that for a lot of our existing request_module() cases, they
> really are pretty trivial. In most cases, it's probably really about
> whether you have the hardware or not.

Right, and those would all be privileged, etc.

> So for the hardware driver cases, either the hardware enumerates
> itself, or it is presumably set up by the system scripts anyway, and
> CAP_SYS_MODULE is all fine. The "open device node" case is one special
> case, though.

Yeah, going through the call sites and asking "where does the
privilege for loading this module derive from?" for each one will help
shape the resulting API changes.

> That mainly leaves the protocol ones we need to look out for, I suspect.

This is where a lot of the exposure really comes from. socket()
triggers a bunch of stuff, but doesn't have an obvious privilege
associated with it... while it already does the name templates, maybe
add request_module_socket() just to explicitly mark it?

>> 3) Add WARN_RATELIMIT for request_module() calls without
>> CAP_SYS_MODULE to shake out other places where request_module_cap() is
>> needed.
>
> Yes.
>
> And this is where I hope that there really aren't actually all that
> many cases that will warn, and that it's hopefully easy to simply just
> look at a handful of reports and say "ok, that case is obviously
> fine".
>
> And I may be wrong.
>
>> 4) Adapt the original patch series to add the per-process flag that
>> can block privilege elevations.
>
> Yes.

Okay, excellent. Hopefully we haven't scared off Djalal, and hopefully
Jessica doesn't think this is all insane. :)

Thanks!

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-30  0:44                                         ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-30  0:44 UTC (permalink / raw)
  To: linux-security-module

On Wed, Nov 29, 2017 at 2:14 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Nov 29, 2017 at 1:17 PM, Kees Cook <keescook@chromium.org> wrote:
>> 1) Add request_module_cap(required_cap, module_name_prefix, fmt, fmt_args...)
>>
>> 2) Convert known privileged-but-not-CAP_SYS_MODULE request_module()
>> callers to request_module_cap(the_needed_cap, prefix, ...)
>
> Yes. The upside seems to be very limited here, but at least it makes
> the users that use CAP_NET_ADMIN instead of CAP_SYS_MODULE able to
> specify so.

Yeah, agreed; it's limited in scope.

>> 2) Convert known unprivileged callers to use request_module_cap(0, ...)
>
> 0 is CAP_CHOWN, so it would have to be -1.

Oops, yes, think-o.

> And I wouldn't actually want to see that as-is. Not only would I not
> want to see people have that "-1" in random driver subsystems, I'd
> much prefer to have actual helper naming that descibes why something
> is ok

So, if some catch-all is needed, maybe request_module_unpriv(...), but
I think it should be possible to identify SOME kind of defining
privilege to check.

> Because as mentioned, I think there are valid permission reasons that
> are _not_ about capabilities that make you able to load a module.
>
> If you can open a character device node, then "misc_open()" will do
>
>                 request_module("char-major-%d-%d", MISC_MAJOR, minor);
>
> and there is nothing about capabilities in the CAP_SYS_MODULE sense
> about the user. But the user _did_ have the privileges to open that
> character device file.
>
> That's why I suggested something like request_module_dev(): it's not
> at all the same thing as request_module_cap(-1, ...), saying "I don't
> need/have a capability". It's saying something else entirely, it's
> basically saying "I have the right based on device permissions".
>
> And something like request_module_dev() might even have real semantic
> meaning, exactly because it says "this module request comes from
> trying to open a device node".

I just want to make sure I'm visualizing this correctly. Using the
misc_open() example, you're thinking something like:

    request_module_dev(file, "char-major-%d-%d", MISC_MAJOR, minor);

which would then do a verification that file->f_cred matched
current_cred()? (For misc_open(), this is obviously going to always be
okay, but other cases may be further from the fops open handler...)

> Why would that be? If we know we're on a system where /dev is
> auto-populated through udev, then the device nodes should have been
> created by the drivers, not the other way around. So we might even
> have a rule that notices that automatically, and simply disables
> request_module_dev() entirely.

Right, we have both directions -- hotplug module loading ("I saw this
PCI alias, load something! Oh! It needs some fancy crypto too!") and
on-use module loading. It seems like on-use module loading would get
covered by request_module_dev(), and the hotplug case would always be
privileged. Anyway... exploring this with real code is clearly
warranted.

> I suspect that for a lot of our existing request_module() cases, they
> really are pretty trivial. In most cases, it's probably really about
> whether you have the hardware or not.

Right, and those would all be privileged, etc.

> So for the hardware driver cases, either the hardware enumerates
> itself, or it is presumably set up by the system scripts anyway, and
> CAP_SYS_MODULE is all fine. The "open device node" case is one special
> case, though.

Yeah, going through the call sites and asking "where does the
privilege for loading this module derive from?" for each one will help
shape the resulting API changes.

> That mainly leaves the protocol ones we need to look out for, I suspect.

This is where a lot of the exposure really comes from. socket()
triggers a bunch of stuff, but doesn't have an obvious privilege
associated with it... while it already does the name templates, maybe
add request_module_socket() just to explicitly mark it?

>> 3) Add WARN_RATELIMIT for request_module() calls without
>> CAP_SYS_MODULE to shake out other places where request_module_cap() is
>> needed.
>
> Yes.
>
> And this is where I hope that there really aren't actually all that
> many cases that will warn, and that it's hopefully easy to simply just
> look at a handful of reports and say "ok, that case is obviously
> fine".
>
> And I may be wrong.
>
>> 4) Adapt the original patch series to add the per-process flag that
>> can block privilege elevations.
>
> Yes.

Okay, excellent. Hopefully we haven't scared off Djalal, and hopefully
Jessica doesn't think this is all insane. :)

Thanks!

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-30  0:44                                         ` Kees Cook
  0 siblings, 0 replies; 266+ messages in thread
From: Kees Cook @ 2017-11-30  0:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Djalal Harouni, Jessica Yu, LSM List, Linux Kernel Mailing List,
	kernel-hardening

On Wed, Nov 29, 2017 at 2:14 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Nov 29, 2017 at 1:17 PM, Kees Cook <keescook@chromium.org> wrote:
>> 1) Add request_module_cap(required_cap, module_name_prefix, fmt, fmt_args...)
>>
>> 2) Convert known privileged-but-not-CAP_SYS_MODULE request_module()
>> callers to request_module_cap(the_needed_cap, prefix, ...)
>
> Yes. The upside seems to be very limited here, but at least it makes
> the users that use CAP_NET_ADMIN instead of CAP_SYS_MODULE able to
> specify so.

Yeah, agreed; it's limited in scope.

>> 2) Convert known unprivileged callers to use request_module_cap(0, ...)
>
> 0 is CAP_CHOWN, so it would have to be -1.

Oops, yes, think-o.

> And I wouldn't actually want to see that as-is. Not only would I not
> want to see people have that "-1" in random driver subsystems, I'd
> much prefer to have actual helper naming that descibes why something
> is ok

So, if some catch-all is needed, maybe request_module_unpriv(...), but
I think it should be possible to identify SOME kind of defining
privilege to check.

> Because as mentioned, I think there are valid permission reasons that
> are _not_ about capabilities that make you able to load a module.
>
> If you can open a character device node, then "misc_open()" will do
>
>                 request_module("char-major-%d-%d", MISC_MAJOR, minor);
>
> and there is nothing about capabilities in the CAP_SYS_MODULE sense
> about the user. But the user _did_ have the privileges to open that
> character device file.
>
> That's why I suggested something like request_module_dev(): it's not
> at all the same thing as request_module_cap(-1, ...), saying "I don't
> need/have a capability". It's saying something else entirely, it's
> basically saying "I have the right based on device permissions".
>
> And something like request_module_dev() might even have real semantic
> meaning, exactly because it says "this module request comes from
> trying to open a device node".

I just want to make sure I'm visualizing this correctly. Using the
misc_open() example, you're thinking something like:

    request_module_dev(file, "char-major-%d-%d", MISC_MAJOR, minor);

which would then do a verification that file->f_cred matched
current_cred()? (For misc_open(), this is obviously going to always be
okay, but other cases may be further from the fops open handler...)

> Why would that be? If we know we're on a system where /dev is
> auto-populated through udev, then the device nodes should have been
> created by the drivers, not the other way around. So we might even
> have a rule that notices that automatically, and simply disables
> request_module_dev() entirely.

Right, we have both directions -- hotplug module loading ("I saw this
PCI alias, load something! Oh! It needs some fancy crypto too!") and
on-use module loading. It seems like on-use module loading would get
covered by request_module_dev(), and the hotplug case would always be
privileged. Anyway... exploring this with real code is clearly
warranted.

> I suspect that for a lot of our existing request_module() cases, they
> really are pretty trivial. In most cases, it's probably really about
> whether you have the hardware or not.

Right, and those would all be privileged, etc.

> So for the hardware driver cases, either the hardware enumerates
> itself, or it is presumably set up by the system scripts anyway, and
> CAP_SYS_MODULE is all fine. The "open device node" case is one special
> case, though.

Yeah, going through the call sites and asking "where does the
privilege for loading this module derive from?" for each one will help
shape the resulting API changes.

> That mainly leaves the protocol ones we need to look out for, I suspect.

This is where a lot of the exposure really comes from. socket()
triggers a bunch of stuff, but doesn't have an obvious privilege
associated with it... while it already does the name templates, maybe
add request_module_socket() just to explicitly mark it?

>> 3) Add WARN_RATELIMIT for request_module() calls without
>> CAP_SYS_MODULE to shake out other places where request_module_cap() is
>> needed.
>
> Yes.
>
> And this is where I hope that there really aren't actually all that
> many cases that will warn, and that it's hopefully easy to simply just
> look at a handful of reports and say "ok, that case is obviously
> fine".
>
> And I may be wrong.
>
>> 4) Adapt the original patch series to add the per-process flag that
>> can block privilege elevations.
>
> Yes.

Okay, excellent. Hopefully we haven't scared off Djalal, and hopefully
Jessica doesn't think this is all insane. :)

Thanks!

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 3/5] modules:capabilities: automatic module loading restriction
  2017-11-27 17:18   ` Djalal Harouni
  (?)
@ 2017-11-30  1:23     ` Luis R. Rodriguez
  -1 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-30  1:23 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	netdev, Peter Zijlstra, Linus Torvalds

On Mon, Nov 27, 2017 at 06:18:36PM +0100, Djalal Harouni wrote:
> diff --git a/include/linux/module.h b/include/linux/module.h
> index 5cbb239..c36aed8 100644
> --- a/include/linux/module.h
> +++ b/include/linux/module.h
> @@ -261,7 +261,16 @@ struct notifier_block;
>  
>  #ifdef CONFIG_MODULES
>  
> -extern int modules_disabled; /* for sysctl */
> +enum {
> +	MODULES_AUTOLOAD_ALLOWED	= 0,
> +	MODULES_AUTOLOAD_PRIVILEGED	= 1,
> +	MODULES_AUTOLOAD_DISABLED	= 2,
> +};
> +

Can you kdocify these and add a respective rst doc file?  Maybe stuff your
extensive docs which you are currently adding to
Documentation/sysctl/kernel.txt to this new file and in kernel.txt just refer
to it. This way this can be also nicely visibly documented on the web with the
new sphinx.

This way you can take advantage of the kdocs you are already adding and refer
to them.

> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 2fb4e27..0b6f0c8 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -683,6 +688,15 @@ static struct ctl_table kern_table[] = {
>  		.extra1		= &one,
>  		.extra2		= &one,
>  	},
> +	{
> +		.procname	= "modules_autoload_mode",
> +		.data		= &modules_autoload_mode,
> +		.maxlen		= sizeof(int),
> +		.mode		= 0644,
> +		.proc_handler	= modules_autoload_dointvec_minmax,

It would seem this is a unint ... so why not reflect that?

> @@ -2499,6 +2513,20 @@ static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
>  }
>  #endif
>  
> +#ifdef CONFIG_MODULES
> +static int modules_autoload_dointvec_minmax(struct ctl_table *table, int write,
> +				void __user *buffer, size_t *lenp, loff_t *ppos)
> +{
> +	/*
> +	 * Only CAP_SYS_MODULE in init user namespace are allowed to change this
> +	 */
> +	if (write && !capable(CAP_SYS_MODULE))
> +		return -EPERM;
> +
> +	return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> +}
> +#endif

We now have proc_douintvec_minmax().

  Luis

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 3/5] modules:capabilities: automatic module loading restriction
@ 2017-11-30  1:23     ` Luis R. Rodriguez
  0 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-30  1:23 UTC (permalink / raw)
  To: linux-security-module

On Mon, Nov 27, 2017 at 06:18:36PM +0100, Djalal Harouni wrote:
> diff --git a/include/linux/module.h b/include/linux/module.h
> index 5cbb239..c36aed8 100644
> --- a/include/linux/module.h
> +++ b/include/linux/module.h
> @@ -261,7 +261,16 @@ struct notifier_block;
>  
>  #ifdef CONFIG_MODULES
>  
> -extern int modules_disabled; /* for sysctl */
> +enum {
> +	MODULES_AUTOLOAD_ALLOWED	= 0,
> +	MODULES_AUTOLOAD_PRIVILEGED	= 1,
> +	MODULES_AUTOLOAD_DISABLED	= 2,
> +};
> +

Can you kdocify these and add a respective rst doc file?  Maybe stuff your
extensive docs which you are currently adding to
Documentation/sysctl/kernel.txt to this new file and in kernel.txt just refer
to it. This way this can be also nicely visibly documented on the web with the
new sphinx.

This way you can take advantage of the kdocs you are already adding and refer
to them.

> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 2fb4e27..0b6f0c8 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -683,6 +688,15 @@ static struct ctl_table kern_table[] = {
>  		.extra1		= &one,
>  		.extra2		= &one,
>  	},
> +	{
> +		.procname	= "modules_autoload_mode",
> +		.data		= &modules_autoload_mode,
> +		.maxlen		= sizeof(int),
> +		.mode		= 0644,
> +		.proc_handler	= modules_autoload_dointvec_minmax,

It would seem this is a unint ... so why not reflect that?

> @@ -2499,6 +2513,20 @@ static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
>  }
>  #endif
>  
> +#ifdef CONFIG_MODULES
> +static int modules_autoload_dointvec_minmax(struct ctl_table *table, int write,
> +				void __user *buffer, size_t *lenp, loff_t *ppos)
> +{
> +	/*
> +	 * Only CAP_SYS_MODULE in init user namespace are allowed to change this
> +	 */
> +	if (write && !capable(CAP_SYS_MODULE))
> +		return -EPERM;
> +
> +	return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> +}
> +#endif

We now have proc_douintvec_minmax().

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 3/5] modules:capabilities: automatic module loading restriction
@ 2017-11-30  1:23     ` Luis R. Rodriguez
  0 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-30  1:23 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	netdev, Peter Zijlstra, Linus Torvalds

On Mon, Nov 27, 2017 at 06:18:36PM +0100, Djalal Harouni wrote:
> diff --git a/include/linux/module.h b/include/linux/module.h
> index 5cbb239..c36aed8 100644
> --- a/include/linux/module.h
> +++ b/include/linux/module.h
> @@ -261,7 +261,16 @@ struct notifier_block;
>  
>  #ifdef CONFIG_MODULES
>  
> -extern int modules_disabled; /* for sysctl */
> +enum {
> +	MODULES_AUTOLOAD_ALLOWED	= 0,
> +	MODULES_AUTOLOAD_PRIVILEGED	= 1,
> +	MODULES_AUTOLOAD_DISABLED	= 2,
> +};
> +

Can you kdocify these and add a respective rst doc file?  Maybe stuff your
extensive docs which you are currently adding to
Documentation/sysctl/kernel.txt to this new file and in kernel.txt just refer
to it. This way this can be also nicely visibly documented on the web with the
new sphinx.

This way you can take advantage of the kdocs you are already adding and refer
to them.

> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 2fb4e27..0b6f0c8 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -683,6 +688,15 @@ static struct ctl_table kern_table[] = {
>  		.extra1		= &one,
>  		.extra2		= &one,
>  	},
> +	{
> +		.procname	= "modules_autoload_mode",
> +		.data		= &modules_autoload_mode,
> +		.maxlen		= sizeof(int),
> +		.mode		= 0644,
> +		.proc_handler	= modules_autoload_dointvec_minmax,

It would seem this is a unint ... so why not reflect that?

> @@ -2499,6 +2513,20 @@ static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
>  }
>  #endif
>  
> +#ifdef CONFIG_MODULES
> +static int modules_autoload_dointvec_minmax(struct ctl_table *table, int write,
> +				void __user *buffer, size_t *lenp, loff_t *ppos)
> +{
> +	/*
> +	 * Only CAP_SYS_MODULE in init user namespace are allowed to change this
> +	 */
> +	if (write && !capable(CAP_SYS_MODULE))
> +		return -EPERM;
> +
> +	return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> +}
> +#endif

We now have proc_douintvec_minmax().

  Luis

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 2/5] modules:capabilities: add cap_kernel_module_request() permission check
  2017-11-27 17:18   ` Djalal Harouni
  (?)
@ 2017-11-30  2:05     ` Luis R. Rodriguez
  -1 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-30  2:05 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	netdev, Peter Zijlstra, Linus Torvalds

On Mon, Nov 27, 2017 at 06:18:35PM +0100, Djalal Harouni wrote:
> +/* Determine whether a module auto-load operation is permitted. */
> +int may_autoload_module(char *kmod_name, int required_cap,
> +			const char *kmod_prefix);
> +

While we are reviewing a general LSM for this, it has me wondering if an LSM or
userspace feed info may every want to use other possible context we could add for
free to make a determination.

For instance since all request_module() calls are in header files, we could 
for add for free THIS_MODULE as context to may_autoload_module() as well, so
struct module. The LSM could in theory then also help ensure only specific
modules are allowed to request a module load. Perhaps userspace could say
only built-in code could request certain modules.

Just a thought.

  Luis

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 2/5] modules:capabilities: add cap_kernel_module_request() permission check
@ 2017-11-30  2:05     ` Luis R. Rodriguez
  0 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-30  2:05 UTC (permalink / raw)
  To: linux-security-module

On Mon, Nov 27, 2017 at 06:18:35PM +0100, Djalal Harouni wrote:
> +/* Determine whether a module auto-load operation is permitted. */
> +int may_autoload_module(char *kmod_name, int required_cap,
> +			const char *kmod_prefix);
> +

While we are reviewing a general LSM for this, it has me wondering if an LSM or
userspace feed info may every want to use other possible context we could add for
free to make a determination.

For instance since all request_module() calls are in header files, we could 
for add for free THIS_MODULE as context to may_autoload_module() as well, so
struct module. The LSM could in theory then also help ensure only specific
modules are allowed to request a module load. Perhaps userspace could say
only built-in code could request certain modules.

Just a thought.

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 2/5] modules:capabilities: add cap_kernel_module_request() permission check
@ 2017-11-30  2:05     ` Luis R. Rodriguez
  0 siblings, 0 replies; 266+ messages in thread
From: Luis R. Rodriguez @ 2017-11-30  2:05 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Luis R. Rodriguez,
	James Morris, Ben Hutchings, Solar Designer, Serge Hallyn,
	Jessica Yu, Rusty Russell, linux-kernel, linux-security-module,
	kernel-hardening, Jonathan Corbet, Ingo Molnar, David S. Miller,
	netdev, Peter Zijlstra, Linus Torvalds

On Mon, Nov 27, 2017 at 06:18:35PM +0100, Djalal Harouni wrote:
> +/* Determine whether a module auto-load operation is permitted. */
> +int may_autoload_module(char *kmod_name, int required_cap,
> +			const char *kmod_prefix);
> +

While we are reviewing a general LSM for this, it has me wondering if an LSM or
userspace feed info may every want to use other possible context we could add for
free to make a determination.

For instance since all request_module() calls are in header files, we could 
for add for free THIS_MODULE as context to may_autoload_module() as well, so
struct module. The LSM could in theory then also help ensure only specific
modules are allowed to request a module load. Perhaps userspace could say
only built-in code could request certain modules.

Just a thought.

  Luis

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-30  0:44                                         ` Kees Cook
  (?)
@ 2017-11-30  2:08                                           ` Linus Torvalds
  -1 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-30  2:08 UTC (permalink / raw)
  To: Kees Cook
  Cc: Djalal Harouni, Jessica Yu, LSM List, Linux Kernel Mailing List,
	kernel-hardening

On Wed, Nov 29, 2017 at 4:44 PM, Kees Cook <keescook@chromium.org> wrote:
>
>> That mainly leaves the protocol ones we need to look out for, I suspect.
>
> This is where a lot of the exposure really comes from. socket()
> triggers a bunch of stuff, but doesn't have an obvious privilege
> associated with it... while it already does the name templates, maybe
> add request_module_socket() just to explicitly mark it?

.. and this is where I'd expect that maybe we'd need some hackery.

Even including some ad-hoc rules like "this module is actually
maintained", possibly even with some /sys interface to extend/reduce
that set.

But maybe it's not even that bad.

                Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-30  2:08                                           ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-30  2:08 UTC (permalink / raw)
  To: linux-security-module

On Wed, Nov 29, 2017 at 4:44 PM, Kees Cook <keescook@chromium.org> wrote:
>
>> That mainly leaves the protocol ones we need to look out for, I suspect.
>
> This is where a lot of the exposure really comes from. socket()
> triggers a bunch of stuff, but doesn't have an obvious privilege
> associated with it... while it already does the name templates, maybe
> add request_module_socket() just to explicitly mark it?

.. and this is where I'd expect that maybe we'd need some hackery.

Even including some ad-hoc rules like "this module is actually
maintained", possibly even with some /sys interface to extend/reduce
that set.

But maybe it's not even that bad.

                Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-30  2:08                                           ` Linus Torvalds
  0 siblings, 0 replies; 266+ messages in thread
From: Linus Torvalds @ 2017-11-30  2:08 UTC (permalink / raw)
  To: Kees Cook
  Cc: Djalal Harouni, Jessica Yu, LSM List, Linux Kernel Mailing List,
	kernel-hardening

On Wed, Nov 29, 2017 at 4:44 PM, Kees Cook <keescook@chromium.org> wrote:
>
>> That mainly leaves the protocol ones we need to look out for, I suspect.
>
> This is where a lot of the exposure really comes from. socket()
> triggers a bunch of stuff, but doesn't have an obvious privilege
> associated with it... while it already does the name templates, maybe
> add request_module_socket() just to explicitly mark it?

.. and this is where I'd expect that maybe we'd need some hackery.

Even including some ad-hoc rules like "this module is actually
maintained", possibly even with some /sys interface to extend/reduce
that set.

But maybe it's not even that bad.

                Linus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-29 22:14                                       ` Linus Torvalds
  (?)
@ 2017-11-30  6:51                                         ` Daniel Micay
  -1 siblings, 0 replies; 266+ messages in thread
From: Daniel Micay @ 2017-11-30  6:51 UTC (permalink / raw)
  To: Linus Torvalds, Kees Cook
  Cc: Djalal Harouni, Jessica Yu, LSM List, Linux Kernel Mailing List,
	kernel-hardening

> And once you disable it by default, and it becomes purely opt-in, that
> means that nothing will change for most cases. Some embedded people
> that do their own thing (ie Android) might change, but normal
> distributions probably won't.
> 
> Yes, Android may be 99% of the users, and yes, the embedded world in
> general needs to be secure, but I'd still like this to be something
> that helps _everybody_.

Android devices won't get much benefit since they ship a tiny set of
modules chosen for the device. The kernels already get very stripped
down to the bare minimum vs. enabling every feature and driver available
and shipping it all by default on a traditional distribution.

Lots of potential module attack surface also gets eliminated by default
via their SELinux whitelists for /dev, /sys, /proc, debugfs, ioctl
commands, etc. The global seccomp whitelist might be relevant in some
cases too.

Android devices like to build everything into the kernel too, so even if
they weren't using a module this feature wouldn't usually help them. It
would need to work like this existing sysctl:

    net.ipv4.tcp_available_congestion_control = cubic reno lp

i.e. whitelists for functionality offered by the modules, not just
whether they can be loaded.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-30  6:51                                         ` Daniel Micay
  0 siblings, 0 replies; 266+ messages in thread
From: Daniel Micay @ 2017-11-30  6:51 UTC (permalink / raw)
  To: linux-security-module

> And once you disable it by default, and it becomes purely opt-in, that
> means that nothing will change for most cases. Some embedded people
> that do their own thing (ie Android) might change, but normal
> distributions probably won't.
> 
> Yes, Android may be 99% of the users, and yes, the embedded world in
> general needs to be secure, but I'd still like this to be something
> that helps _everybody_.

Android devices won't get much benefit since they ship a tiny set of
modules chosen for the device. The kernels already get very stripped
down to the bare minimum vs. enabling every feature and driver available
and shipping it all by default on a traditional distribution.

Lots of potential module attack surface also gets eliminated by default
via their SELinux whitelists for /dev, /sys, /proc, debugfs, ioctl
commands, etc. The global seccomp whitelist might be relevant in some
cases too.

Android devices like to build everything into the kernel too, so even if
they weren't using a module this feature wouldn't usually help them. It
would need to work like this existing sysctl:

    net.ipv4.tcp_available_congestion_control = cubic reno lp

i.e. whitelists for functionality offered by the modules, not just
whether they can be loaded.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-30  6:51                                         ` Daniel Micay
  0 siblings, 0 replies; 266+ messages in thread
From: Daniel Micay @ 2017-11-30  6:51 UTC (permalink / raw)
  To: Linus Torvalds, Kees Cook
  Cc: Djalal Harouni, Jessica Yu, LSM List, Linux Kernel Mailing List,
	kernel-hardening

> And once you disable it by default, and it becomes purely opt-in, that
> means that nothing will change for most cases. Some embedded people
> that do their own thing (ie Android) might change, but normal
> distributions probably won't.
> 
> Yes, Android may be 99% of the users, and yes, the embedded world in
> general needs to be secure, but I'd still like this to be something
> that helps _everybody_.

Android devices won't get much benefit since they ship a tiny set of
modules chosen for the device. The kernels already get very stripped
down to the bare minimum vs. enabling every feature and driver available
and shipping it all by default on a traditional distribution.

Lots of potential module attack surface also gets eliminated by default
via their SELinux whitelists for /dev, /sys, /proc, debugfs, ioctl
commands, etc. The global seccomp whitelist might be relevant in some
cases too.

Android devices like to build everything into the kernel too, so even if
they weren't using a module this feature wouldn't usually help them. It
would need to work like this existing sysctl:

    net.ipv4.tcp_available_congestion_control = cubic reno lp

i.e. whitelists for functionality offered by the modules, not just
whether they can be loaded.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-30  6:51                                         ` Daniel Micay
  (?)
@ 2017-11-30  8:50                                           ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-30  8:50 UTC (permalink / raw)
  To: Daniel Micay
  Cc: Linus Torvalds, Kees Cook, Jessica Yu, LSM List,
	Linux Kernel Mailing List, kernel-hardening

On Thu, Nov 30, 2017 at 7:51 AM, Daniel Micay <danielmicay@gmail.com> wrote:
[...]
> Lots of potential module attack surface also gets eliminated by default
> via their SELinux whitelists for /dev, /sys, /proc, debugfs, ioctl
> commands, etc. The global seccomp whitelist might be relevant in some
> cases too.

In embedded systems we can't maintain a SELinux policy, distro man
power hardly manage. We have abstracted seccomp etc, but the kernel
inherited the difficult multiplex things, plus all other paths that
trigger this.


> Android devices like to build everything into the kernel too, so even if
> they weren't using a module this feature wouldn't usually help them. It
> would need to work like this existing sysctl:
>
>     net.ipv4.tcp_available_congestion_control = cubic reno lp
>
> i.e. whitelists for functionality offered by the modules, not just
> whether they can be loaded.

Yes, but it is hard to maintain a whitelist policy, the code is hardly
maintained... if you include everything you should have an LSM policy
or something like that, and compiling kernels is expert thing.
Otherwise IMHO the kernel should provide default secure behaviour on
how to load or add new functionality to the running one. From a user
perspective, a switch "yes/no" that a privileged entity will
*understand* and assume is what should be there, and the switch or
flag as discussed here is local to processes, the sysctl will be
removed. IMO it should come from userspace point of view, cause as an
example the sysctl:

net.ipv4.tcp_available_congestion_control = cubic reno lp

Is kernel thing, too technical, userspace developers, admins or
privileged entity will not understand what cubic or reno mean. Doing
the same per functionality directly like this seems to much of a
burden compared to the use case. The kernel maybe can do this to
advance the art of the networking stack and for advanced cases, but in
IMHO a sane default behaviour + an abstracted process/sandbox flag
"yes/no" for most others, userspace developers and humans is what
should be provided and we need the kernel to help here.

It seems that Linus and kees agreed on this direction which allows me
to follow up.

Thanks!


-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-30  8:50                                           ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-30  8:50 UTC (permalink / raw)
  To: linux-security-module

On Thu, Nov 30, 2017 at 7:51 AM, Daniel Micay <danielmicay@gmail.com> wrote:
[...]
> Lots of potential module attack surface also gets eliminated by default
> via their SELinux whitelists for /dev, /sys, /proc, debugfs, ioctl
> commands, etc. The global seccomp whitelist might be relevant in some
> cases too.

In embedded systems we can't maintain a SELinux policy, distro man
power hardly manage. We have abstracted seccomp etc, but the kernel
inherited the difficult multiplex things, plus all other paths that
trigger this.


> Android devices like to build everything into the kernel too, so even if
> they weren't using a module this feature wouldn't usually help them. It
> would need to work like this existing sysctl:
>
>     net.ipv4.tcp_available_congestion_control = cubic reno lp
>
> i.e. whitelists for functionality offered by the modules, not just
> whether they can be loaded.

Yes, but it is hard to maintain a whitelist policy, the code is hardly
maintained... if you include everything you should have an LSM policy
or something like that, and compiling kernels is expert thing.
Otherwise IMHO the kernel should provide default secure behaviour on
how to load or add new functionality to the running one. From a user
perspective, a switch "yes/no" that a privileged entity will
*understand* and assume is what should be there, and the switch or
flag as discussed here is local to processes, the sysctl will be
removed. IMO it should come from userspace point of view, cause as an
example the sysctl:

net.ipv4.tcp_available_congestion_control = cubic reno lp

Is kernel thing, too technical, userspace developers, admins or
privileged entity will not understand what cubic or reno mean. Doing
the same per functionality directly like this seems to much of a
burden compared to the use case. The kernel maybe can do this to
advance the art of the networking stack and for advanced cases, but in
IMHO a sane default behaviour + an abstracted process/sandbox flag
"yes/no" for most others, userspace developers and humans is what
should be provided and we need the kernel to help here.

It seems that Linus and kees agreed on this direction which allows me
to follow up.

Thanks!


-- 
tixxdz
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-30  8:50                                           ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-30  8:50 UTC (permalink / raw)
  To: Daniel Micay
  Cc: Linus Torvalds, Kees Cook, Jessica Yu, LSM List,
	Linux Kernel Mailing List, kernel-hardening

On Thu, Nov 30, 2017 at 7:51 AM, Daniel Micay <danielmicay@gmail.com> wrote:
[...]
> Lots of potential module attack surface also gets eliminated by default
> via their SELinux whitelists for /dev, /sys, /proc, debugfs, ioctl
> commands, etc. The global seccomp whitelist might be relevant in some
> cases too.

In embedded systems we can't maintain a SELinux policy, distro man
power hardly manage. We have abstracted seccomp etc, but the kernel
inherited the difficult multiplex things, plus all other paths that
trigger this.


> Android devices like to build everything into the kernel too, so even if
> they weren't using a module this feature wouldn't usually help them. It
> would need to work like this existing sysctl:
>
>     net.ipv4.tcp_available_congestion_control = cubic reno lp
>
> i.e. whitelists for functionality offered by the modules, not just
> whether they can be loaded.

Yes, but it is hard to maintain a whitelist policy, the code is hardly
maintained... if you include everything you should have an LSM policy
or something like that, and compiling kernels is expert thing.
Otherwise IMHO the kernel should provide default secure behaviour on
how to load or add new functionality to the running one. From a user
perspective, a switch "yes/no" that a privileged entity will
*understand* and assume is what should be there, and the switch or
flag as discussed here is local to processes, the sysctl will be
removed. IMO it should come from userspace point of view, cause as an
example the sysctl:

net.ipv4.tcp_available_congestion_control = cubic reno lp

Is kernel thing, too technical, userspace developers, admins or
privileged entity will not understand what cubic or reno mean. Doing
the same per functionality directly like this seems to much of a
burden compared to the use case. The kernel maybe can do this to
advance the art of the networking stack and for advanced cases, but in
IMHO a sane default behaviour + an abstracted process/sandbox flag
"yes/no" for most others, userspace developers and humans is what
should be provided and we need the kernel to help here.

It seems that Linus and kees agreed on this direction which allows me
to follow up.

Thanks!


-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 3/5] modules:capabilities: automatic module loading restriction
  2017-11-30  1:23     ` Luis R. Rodriguez
  (?)
@ 2017-11-30 12:22       ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-30 12:22 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, James Morris,
	Ben Hutchings, Solar Designer, Serge Hallyn, Jessica Yu,
	Rusty Russell, linux-kernel, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Thu, Nov 30, 2017 at 2:23 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Mon, Nov 27, 2017 at 06:18:36PM +0100, Djalal Harouni wrote:
>> diff --git a/include/linux/module.h b/include/linux/module.h
>> index 5cbb239..c36aed8 100644
>> --- a/include/linux/module.h
>> +++ b/include/linux/module.h
>> @@ -261,7 +261,16 @@ struct notifier_block;
>>
>>  #ifdef CONFIG_MODULES
>>
>> -extern int modules_disabled; /* for sysctl */
>> +enum {
>> +     MODULES_AUTOLOAD_ALLOWED        = 0,
>> +     MODULES_AUTOLOAD_PRIVILEGED     = 1,
>> +     MODULES_AUTOLOAD_DISABLED       = 2,
>> +};
>> +
>
> Can you kdocify these and add a respective rst doc file?  Maybe stuff your
> extensive docs which you are currently adding to
> Documentation/sysctl/kernel.txt to this new file and in kernel.txt just refer
> to it. This way this can be also nicely visibly documented on the web with the
> new sphinx.
>
> This way you can take advantage of the kdocs you are already adding and refer
> to them.

Alright I'll do it in the next series next week, we'll change the
semantics as requested by Linus and Kees here:
http://www.openwall.com/lists/kernel-hardening/2017/11/29/38

To block the privilege escalation through the usermod helper.


>> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
>> index 2fb4e27..0b6f0c8 100644
>> --- a/kernel/sysctl.c
>> +++ b/kernel/sysctl.c
>> @@ -683,6 +688,15 @@ static struct ctl_table kern_table[] = {
>>               .extra1         = &one,
>>               .extra2         = &one,
>>       },
>> +     {
>> +             .procname       = "modules_autoload_mode",
>> +             .data           = &modules_autoload_mode,
>> +             .maxlen         = sizeof(int),
>> +             .mode           = 0644,
>> +             .proc_handler   = modules_autoload_dointvec_minmax,
>
> It would seem this is a unint ... so why not reflect that?
>
>> @@ -2499,6 +2513,20 @@ static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
>>  }
>>  #endif
>>
>> +#ifdef CONFIG_MODULES
>> +static int modules_autoload_dointvec_minmax(struct ctl_table *table, int write,
>> +                             void __user *buffer, size_t *lenp, loff_t *ppos)
>> +{
>> +     /*
>> +      * Only CAP_SYS_MODULE in init user namespace are allowed to change this
>> +      */
>> +     if (write && !capable(CAP_SYS_MODULE))
>> +             return -EPERM;
>> +
>> +     return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
>> +}
>> +#endif
>
> We now have proc_douintvec_minmax().
>

Yes, however in that same response by Linus it was suggested to drop
the sysctl completely, so next iterations will not have this code.

Thank you for the review!

-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 3/5] modules:capabilities: automatic module loading restriction
@ 2017-11-30 12:22       ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-30 12:22 UTC (permalink / raw)
  To: linux-security-module

On Thu, Nov 30, 2017 at 2:23 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Mon, Nov 27, 2017 at 06:18:36PM +0100, Djalal Harouni wrote:
>> diff --git a/include/linux/module.h b/include/linux/module.h
>> index 5cbb239..c36aed8 100644
>> --- a/include/linux/module.h
>> +++ b/include/linux/module.h
>> @@ -261,7 +261,16 @@ struct notifier_block;
>>
>>  #ifdef CONFIG_MODULES
>>
>> -extern int modules_disabled; /* for sysctl */
>> +enum {
>> +     MODULES_AUTOLOAD_ALLOWED        = 0,
>> +     MODULES_AUTOLOAD_PRIVILEGED     = 1,
>> +     MODULES_AUTOLOAD_DISABLED       = 2,
>> +};
>> +
>
> Can you kdocify these and add a respective rst doc file?  Maybe stuff your
> extensive docs which you are currently adding to
> Documentation/sysctl/kernel.txt to this new file and in kernel.txt just refer
> to it. This way this can be also nicely visibly documented on the web with the
> new sphinx.
>
> This way you can take advantage of the kdocs you are already adding and refer
> to them.

Alright I'll do it in the next series next week, we'll change the
semantics as requested by Linus and Kees here:
http://www.openwall.com/lists/kernel-hardening/2017/11/29/38

To block the privilege escalation through the usermod helper.


>> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
>> index 2fb4e27..0b6f0c8 100644
>> --- a/kernel/sysctl.c
>> +++ b/kernel/sysctl.c
>> @@ -683,6 +688,15 @@ static struct ctl_table kern_table[] = {
>>               .extra1         = &one,
>>               .extra2         = &one,
>>       },
>> +     {
>> +             .procname       = "modules_autoload_mode",
>> +             .data           = &modules_autoload_mode,
>> +             .maxlen         = sizeof(int),
>> +             .mode           = 0644,
>> +             .proc_handler   = modules_autoload_dointvec_minmax,
>
> It would seem this is a unint ... so why not reflect that?
>
>> @@ -2499,6 +2513,20 @@ static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
>>  }
>>  #endif
>>
>> +#ifdef CONFIG_MODULES
>> +static int modules_autoload_dointvec_minmax(struct ctl_table *table, int write,
>> +                             void __user *buffer, size_t *lenp, loff_t *ppos)
>> +{
>> +     /*
>> +      * Only CAP_SYS_MODULE in init user namespace are allowed to change this
>> +      */
>> +     if (write && !capable(CAP_SYS_MODULE))
>> +             return -EPERM;
>> +
>> +     return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
>> +}
>> +#endif
>
> We now have proc_douintvec_minmax().
>

Yes, however in that same response by Linus it was suggested to drop
the sysctl completely, so next iterations will not have this code.

Thank you for the review!

-- 
tixxdz
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 3/5] modules:capabilities: automatic module loading restriction
@ 2017-11-30 12:22       ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-30 12:22 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Kees Cook, Andy Lutomirski, Andrew Morton, James Morris,
	Ben Hutchings, Solar Designer, Serge Hallyn, Jessica Yu,
	Rusty Russell, linux-kernel, LSM List, kernel-hardening,
	Jonathan Corbet, Ingo Molnar, David S. Miller,
	Network Development, Peter Zijlstra, Linus Torvalds

On Thu, Nov 30, 2017 at 2:23 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Mon, Nov 27, 2017 at 06:18:36PM +0100, Djalal Harouni wrote:
>> diff --git a/include/linux/module.h b/include/linux/module.h
>> index 5cbb239..c36aed8 100644
>> --- a/include/linux/module.h
>> +++ b/include/linux/module.h
>> @@ -261,7 +261,16 @@ struct notifier_block;
>>
>>  #ifdef CONFIG_MODULES
>>
>> -extern int modules_disabled; /* for sysctl */
>> +enum {
>> +     MODULES_AUTOLOAD_ALLOWED        = 0,
>> +     MODULES_AUTOLOAD_PRIVILEGED     = 1,
>> +     MODULES_AUTOLOAD_DISABLED       = 2,
>> +};
>> +
>
> Can you kdocify these and add a respective rst doc file?  Maybe stuff your
> extensive docs which you are currently adding to
> Documentation/sysctl/kernel.txt to this new file and in kernel.txt just refer
> to it. This way this can be also nicely visibly documented on the web with the
> new sphinx.
>
> This way you can take advantage of the kdocs you are already adding and refer
> to them.

Alright I'll do it in the next series next week, we'll change the
semantics as requested by Linus and Kees here:
http://www.openwall.com/lists/kernel-hardening/2017/11/29/38

To block the privilege escalation through the usermod helper.


>> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
>> index 2fb4e27..0b6f0c8 100644
>> --- a/kernel/sysctl.c
>> +++ b/kernel/sysctl.c
>> @@ -683,6 +688,15 @@ static struct ctl_table kern_table[] = {
>>               .extra1         = &one,
>>               .extra2         = &one,
>>       },
>> +     {
>> +             .procname       = "modules_autoload_mode",
>> +             .data           = &modules_autoload_mode,
>> +             .maxlen         = sizeof(int),
>> +             .mode           = 0644,
>> +             .proc_handler   = modules_autoload_dointvec_minmax,
>
> It would seem this is a unint ... so why not reflect that?
>
>> @@ -2499,6 +2513,20 @@ static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
>>  }
>>  #endif
>>
>> +#ifdef CONFIG_MODULES
>> +static int modules_autoload_dointvec_minmax(struct ctl_table *table, int write,
>> +                             void __user *buffer, size_t *lenp, loff_t *ppos)
>> +{
>> +     /*
>> +      * Only CAP_SYS_MODULE in init user namespace are allowed to change this
>> +      */
>> +     if (write && !capable(CAP_SYS_MODULE))
>> +             return -EPERM;
>> +
>> +     return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
>> +}
>> +#endif
>
> We now have proc_douintvec_minmax().
>

Yes, however in that same response by Linus it was suggested to drop
the sysctl completely, so next iterations will not have this code.

Thank you for the review!

-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-30  8:50                                           ` Djalal Harouni
  (?)
@ 2017-11-30 14:16                                             ` Theodore Ts'o
  -1 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-30 14:16 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Daniel Micay, Linus Torvalds, Kees Cook, Jessica Yu, LSM List,
	Linux Kernel Mailing List, kernel-hardening

On Thu, Nov 30, 2017 at 09:50:27AM +0100, Djalal Harouni wrote:
> In embedded systems we can't maintain a SELinux policy, distro man
> power hardly manage. We have abstracted seccomp etc, but the kernel
> inherited the difficult multiplex things, plus all other paths that
> trigger this.....

> Yes, but it is hard to maintain a whitelist policy, the code is hardly
> maintained...

So this is the part that scares me to death about IOT, and why I tell
everyone to ***never*** trust an IOT device on their home network, and
***never*** trust it with anything you don't mind splattered all over
the front page of NY Times and RT / Sputnick news.

You're saying that you want to use modules (as opposed to compile
everything tightly down to just what you need for the embedded
system); that the code is "hardly maintained".  And yet we're supposed
to consider it trustworthy?

If that's the case, turning off implicit module loading sounds and
thinking that this will somehow be a magic wand sounds.... crazy.

	      	   		     - Ted

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-30 14:16                                             ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-30 14:16 UTC (permalink / raw)
  To: linux-security-module

On Thu, Nov 30, 2017 at 09:50:27AM +0100, Djalal Harouni wrote:
> In embedded systems we can't maintain a SELinux policy, distro man
> power hardly manage. We have abstracted seccomp etc, but the kernel
> inherited the difficult multiplex things, plus all other paths that
> trigger this.....

> Yes, but it is hard to maintain a whitelist policy, the code is hardly
> maintained...

So this is the part that scares me to death about IOT, and why I tell
everyone to ***never*** trust an IOT device on their home network, and
***never*** trust it with anything you don't mind splattered all over
the front page of NY Times and RT / Sputnick news.

You're saying that you want to use modules (as opposed to compile
everything tightly down to just what you need for the embedded
system); that the code is "hardly maintained".  And yet we're supposed
to consider it trustworthy?

If that's the case, turning off implicit module loading sounds and
thinking that this will somehow be a magic wand sounds.... crazy.

	      	   		     - Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-30 14:16                                             ` Theodore Ts'o
  0 siblings, 0 replies; 266+ messages in thread
From: Theodore Ts'o @ 2017-11-30 14:16 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Daniel Micay, Linus Torvalds, Kees Cook, Jessica Yu, LSM List,
	Linux Kernel Mailing List, kernel-hardening

On Thu, Nov 30, 2017 at 09:50:27AM +0100, Djalal Harouni wrote:
> In embedded systems we can't maintain a SELinux policy, distro man
> power hardly manage. We have abstracted seccomp etc, but the kernel
> inherited the difficult multiplex things, plus all other paths that
> trigger this.....

> Yes, but it is hard to maintain a whitelist policy, the code is hardly
> maintained...

So this is the part that scares me to death about IOT, and why I tell
everyone to ***never*** trust an IOT device on their home network, and
***never*** trust it with anything you don't mind splattered all over
the front page of NY Times and RT / Sputnick news.

You're saying that you want to use modules (as opposed to compile
everything tightly down to just what you need for the embedded
system); that the code is "hardly maintained".  And yet we're supposed
to consider it trustworthy?

If that's the case, turning off implicit module loading sounds and
thinking that this will somehow be a magic wand sounds.... crazy.

	      	   		     - Ted

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-30 14:16                                             ` Theodore Ts'o
  (?)
@ 2017-11-30 14:51                                               ` Djalal Harouni
  -1 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-30 14:51 UTC (permalink / raw)
  To: Theodore Ts'o, Djalal Harouni, Daniel Micay, Linus Torvalds,
	Kees Cook, Jessica Yu, LSM List, Linux Kernel Mailing List,
	kernel-hardening

On Thu, Nov 30, 2017 at 3:16 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Thu, Nov 30, 2017 at 09:50:27AM +0100, Djalal Harouni wrote:
>> In embedded systems we can't maintain a SELinux policy, distro man
>> power hardly manage. We have abstracted seccomp etc, but the kernel
>> inherited the difficult multiplex things, plus all other paths that
>> trigger this.....
>
>> Yes, but it is hard to maintain a whitelist policy, the code is hardly
>> maintained...
>
> So this is the part that scares me to death about IOT, and why I tell
> everyone to ***never*** trust an IOT device on their home network, and
> ***never*** trust it with anything you don't mind splattered all over
> the front page of NY Times and RT / Sputnick news.

Yes.

For your pleasure:
https://techcrunch.com/2017/04/25/brickerbot-is-a-vigilante-worm-that-destroys-insecure-iot-devices/
 bricked million of devices to stupid busybox remote port.
https://en.wikipedia.org/wiki/Mirai_(malware)  an other million bots
used to disturb netflix, twitter and others I don't know the details.
...

> You're saying that you want to use modules (as opposed to compile
> everything tightly down to just what you need for the embedded
> system); that the code is "hardly maintained".  And yet we're supposed
> to consider it trustworthy?

I didn't say that.

> If that's the case, turning off implicit module loading sounds and
> thinking that this will somehow be a magic wand sounds.... crazy.

The product costs decide, web developers, javascript, big data
analysis, electronic engineers all want to use Linux for IoT prototype
and sell in some months, they will get any kernel+userspace add their
value on top and sell. It will be non-sense to think that if a web
developer wants to sell a node.js app as an IoT he has to compile a
kernel and do all the other stuff, they all re-use the same layer the
same config for everything. Requiring for everyone to compile its own
kernel does not make much sense. Default safe behaviour is what we
should do.

Thanks!

>                                      - Ted



-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-30 14:51                                               ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-30 14:51 UTC (permalink / raw)
  To: linux-security-module

On Thu, Nov 30, 2017 at 3:16 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Thu, Nov 30, 2017 at 09:50:27AM +0100, Djalal Harouni wrote:
>> In embedded systems we can't maintain a SELinux policy, distro man
>> power hardly manage. We have abstracted seccomp etc, but the kernel
>> inherited the difficult multiplex things, plus all other paths that
>> trigger this.....
>
>> Yes, but it is hard to maintain a whitelist policy, the code is hardly
>> maintained...
>
> So this is the part that scares me to death about IOT, and why I tell
> everyone to ***never*** trust an IOT device on their home network, and
> ***never*** trust it with anything you don't mind splattered all over
> the front page of NY Times and RT / Sputnick news.

Yes.

For your pleasure:
https://techcrunch.com/2017/04/25/brickerbot-is-a-vigilante-worm-that-destroys-insecure-iot-devices/
 bricked million of devices to stupid busybox remote port.
https://en.wikipedia.org/wiki/Mirai_(malware)  an other million bots
used to disturb netflix, twitter and others I don't know the details.
...

> You're saying that you want to use modules (as opposed to compile
> everything tightly down to just what you need for the embedded
> system); that the code is "hardly maintained".  And yet we're supposed
> to consider it trustworthy?

I didn't say that.

> If that's the case, turning off implicit module loading sounds and
> thinking that this will somehow be a magic wand sounds.... crazy.

The product costs decide, web developers, javascript, big data
analysis, electronic engineers all want to use Linux for IoT prototype
and sell in some months, they will get any kernel+userspace add their
value on top and sell. It will be non-sense to think that if a web
developer wants to sell a node.js app as an IoT he has to compile a
kernel and do all the other stuff, they all re-use the same layer the
same config for everything. Requiring for everyone to compile its own
kernel does not make much sense. Default safe behaviour is what we
should do.

Thanks!

>                                      - Ted



-- 
tixxdz
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-11-30 14:51                                               ` Djalal Harouni
  0 siblings, 0 replies; 266+ messages in thread
From: Djalal Harouni @ 2017-11-30 14:51 UTC (permalink / raw)
  To: Theodore Ts'o, Djalal Harouni, Daniel Micay, Linus Torvalds,
	Kees Cook, Jessica Yu, LSM List, Linux Kernel Mailing List,
	kernel-hardening

On Thu, Nov 30, 2017 at 3:16 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Thu, Nov 30, 2017 at 09:50:27AM +0100, Djalal Harouni wrote:
>> In embedded systems we can't maintain a SELinux policy, distro man
>> power hardly manage. We have abstracted seccomp etc, but the kernel
>> inherited the difficult multiplex things, plus all other paths that
>> trigger this.....
>
>> Yes, but it is hard to maintain a whitelist policy, the code is hardly
>> maintained...
>
> So this is the part that scares me to death about IOT, and why I tell
> everyone to ***never*** trust an IOT device on their home network, and
> ***never*** trust it with anything you don't mind splattered all over
> the front page of NY Times and RT / Sputnick news.

Yes.

For your pleasure:
https://techcrunch.com/2017/04/25/brickerbot-is-a-vigilante-worm-that-destroys-insecure-iot-devices/
 bricked million of devices to stupid busybox remote port.
https://en.wikipedia.org/wiki/Mirai_(malware)  an other million bots
used to disturb netflix, twitter and others I don't know the details.
...

> You're saying that you want to use modules (as opposed to compile
> everything tightly down to just what you need for the embedded
> system); that the code is "hardly maintained".  And yet we're supposed
> to consider it trustworthy?

I didn't say that.

> If that's the case, turning off implicit module loading sounds and
> thinking that this will somehow be a magic wand sounds.... crazy.

The product costs decide, web developers, javascript, big data
analysis, electronic engineers all want to use Linux for IoT prototype
and sell in some months, they will get any kernel+userspace add their
value on top and sell. It will be non-sense to think that if a web
developer wants to sell a node.js app as an IoT he has to compile a
kernel and do all the other stuff, they all re-use the same layer the
same config for everything. Requiring for everyone to compile its own
kernel does not make much sense. Default safe behaviour is what we
should do.

Thanks!

>                                      - Ted



-- 
tixxdz

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
  2017-11-30  0:35                     ` Theodore Ts'o
  (?)
@ 2017-11-30 17:17                       ` Serge E. Hallyn
  -1 siblings, 0 replies; 266+ messages in thread
From: Serge E. Hallyn @ 2017-11-30 17:17 UTC (permalink / raw)
  To: Theodore Ts'o, Serge E. Hallyn, David Miller, gnomes,
	keescook, mcgrof, tixxdz, luto, akpm, james.l.morris,
	ben.hutchings, solar, jeyu, rusty, linux-kernel,
	linux-security-module, kernel-hardening, corbet, mingo, netdev,
	peterz, torvalds

On Wed, Nov 29, 2017 at 07:35:31PM -0500, Theodore Ts'o wrote:
> On Wed, Nov 29, 2017 at 11:28:52AM -0600, Serge E. Hallyn wrote:
> > 
> > Just to be clear, module loading requires - and must always continue to
> > require - CAP_SYS_MODULE against the initial user namespace.  Containers
> > in user namespaces do not have that.
> > 
> > I don't believe anyone has ever claimed that containers which are not in
> > a user namespace are in any way secure.
> 
> Unless the container performs some action which causes the kernel to
> call request_module(), which then loads some kernel module,

A local unprivileged user can do the same thing.  I reject the popular
notion that linux is a single user operating system.  More interesting
are the (very real) cases where root in a container can do something
which a local unprivileged user could not do.  Since a local unprivileged
user can always create a new namespace, *those* constitute a real and
interesting problem.

> potentially containing cr*p unmaintained code which was included when
> the distro compiled the world, into the host kernel.

> This is an attack vector that doesn't exist if you are using VM's.

Until the vm tenant uses a trivial vm escape.

> And in general, the attack surface of the entire Linux
> kernel<->userspace API is far larger than that which is exposed by the
> guest<->host interface.
> 
> For that reason, containers are *far* more insecure than VM's, since
> once the attacker gets root on the guest VM, they then have to attack
> the hypervisor interface.  And if you compare the attack surface of
> the two, it's pretty clear which is larger, and it's not the
> hypervisor interface.

Any time anyone spends a day looking at either the black hole that is
the hardware emulators or the xen and kvm code itself they walk away
with a set of cve's.  It *should* be more secure, it's not.  You're
telling me your house is safe because you put up a no tresspassing
sign.

-serge

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-30 17:17                       ` Serge E. Hallyn
  0 siblings, 0 replies; 266+ messages in thread
From: Serge E. Hallyn @ 2017-11-30 17:17 UTC (permalink / raw)
  To: linux-security-module

On Wed, Nov 29, 2017 at 07:35:31PM -0500, Theodore Ts'o wrote:
> On Wed, Nov 29, 2017 at 11:28:52AM -0600, Serge E. Hallyn wrote:
> > 
> > Just to be clear, module loading requires - and must always continue to
> > require - CAP_SYS_MODULE against the initial user namespace.  Containers
> > in user namespaces do not have that.
> > 
> > I don't believe anyone has ever claimed that containers which are not in
> > a user namespace are in any way secure.
> 
> Unless the container performs some action which causes the kernel to
> call request_module(), which then loads some kernel module,

A local unprivileged user can do the same thing.  I reject the popular
notion that linux is a single user operating system.  More interesting
are the (very real) cases where root in a container can do something
which a local unprivileged user could not do.  Since a local unprivileged
user can always create a new namespace, *those* constitute a real and
interesting problem.

> potentially containing cr*p unmaintained code which was included when
> the distro compiled the world, into the host kernel.

> This is an attack vector that doesn't exist if you are using VM's.

Until the vm tenant uses a trivial vm escape.

> And in general, the attack surface of the entire Linux
> kernel<->userspace API is far larger than that which is exposed by the
> guest<->host interface.
> 
> For that reason, containers are *far* more insecure than VM's, since
> once the attacker gets root on the guest VM, they then have to attack
> the hypervisor interface.  And if you compare the attack surface of
> the two, it's pretty clear which is larger, and it's not the
> hypervisor interface.

Any time anyone spends a day looking at either the black hole that is
the hardware emulators or the xen and kvm code itself they walk away
with a set of cve's.  It *should* be more secure, it's not.  You're
telling me your house is safe because you put up a no tresspassing
sign.

-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 1/5] modules:capabilities: add request_module_cap()
@ 2017-11-30 17:17                       ` Serge E. Hallyn
  0 siblings, 0 replies; 266+ messages in thread
From: Serge E. Hallyn @ 2017-11-30 17:17 UTC (permalink / raw)
  To: Theodore Ts'o, Serge E. Hallyn, David Miller, gnomes,
	keescook, mcgrof, tixxdz, luto, akpm, james.l.morris,
	ben.hutchings, solar, jeyu, rusty, linux-kernel,
	linux-security-module, kernel-hardening, corbet, mingo, netdev,
	peterz, torvalds

On Wed, Nov 29, 2017 at 07:35:31PM -0500, Theodore Ts'o wrote:
> On Wed, Nov 29, 2017 at 11:28:52AM -0600, Serge E. Hallyn wrote:
> > 
> > Just to be clear, module loading requires - and must always continue to
> > require - CAP_SYS_MODULE against the initial user namespace.  Containers
> > in user namespaces do not have that.
> > 
> > I don't believe anyone has ever claimed that containers which are not in
> > a user namespace are in any way secure.
> 
> Unless the container performs some action which causes the kernel to
> call request_module(), which then loads some kernel module,

A local unprivileged user can do the same thing.  I reject the popular
notion that linux is a single user operating system.  More interesting
are the (very real) cases where root in a container can do something
which a local unprivileged user could not do.  Since a local unprivileged
user can always create a new namespace, *those* constitute a real and
interesting problem.

> potentially containing cr*p unmaintained code which was included when
> the distro compiled the world, into the host kernel.

> This is an attack vector that doesn't exist if you are using VM's.

Until the vm tenant uses a trivial vm escape.

> And in general, the attack surface of the entire Linux
> kernel<->userspace API is far larger than that which is exposed by the
> guest<->host interface.
> 
> For that reason, containers are *far* more insecure than VM's, since
> once the attacker gets root on the guest VM, they then have to attack
> the hypervisor interface.  And if you compare the attack surface of
> the two, it's pretty clear which is larger, and it's not the
> hypervisor interface.

Any time anyone spends a day looking at either the black hole that is
the hardware emulators or the xen and kvm code itself they walk away
with a set of cve's.  It *should* be more secure, it's not.  You're
telling me your house is safe because you put up a no tresspassing
sign.

-serge

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-30  8:50                                           ` Djalal Harouni
  (?)
@ 2017-12-01  6:39                                             ` Daniel Micay
  -1 siblings, 0 replies; 266+ messages in thread
From: Daniel Micay @ 2017-12-01  6:39 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Linus Torvalds, Kees Cook, Jessica Yu, LSM List,
	Linux Kernel Mailing List, kernel-hardening

It was suggested that the feature would only be adopted in niches like
Android and I pointed out that it's not really relevant to Android.

It's a waste of time to try convincing me that it's useful elsewhere. I
never said or implied that it wasn't.

On Thu, 2017-11-30 at 09:50 +0100, Djalal Harouni wrote:
> On Thu, Nov 30, 2017 at 7:51 AM, Daniel Micay <danielmicay@gmail.com>
> wrote:
> [...]
> > Lots of potential module attack surface also gets eliminated by
> > default
> > via their SELinux whitelists for /dev, /sys, /proc, debugfs, ioctl
> > commands, etc. The global seccomp whitelist might be relevant in
> > some
> > cases too.
> 
> In embedded systems we can't maintain a SELinux policy, distro man
> power hardly manage. We have abstracted seccomp etc, but the kernel
> inherited the difficult multiplex things, plus all other paths that
> trigger this.

It's cheaper to use an existing system like Android Things where device
makers only need to make their apps and perhaps some userspace hardware
drivers for cases not covered by mainline kernel drivers. I don't think
it makes sense for every device vendor to manage an OS and I seriously
doubt that's how the ecosystem is going to end up as it matures.

> > Android devices like to build everything into the kernel too, so
> > even if
> > they weren't using a module this feature wouldn't usually help them.
> > It
> > would need to work like this existing sysctl:
> > 
> >     net.ipv4.tcp_available_congestion_control = cubic reno lp
> > 
> > i.e. whitelists for functionality offered by the modules, not just
> > whether they can be loaded.
> 
> Yes, but it is hard to maintain a whitelist policy, the code is hardly
> maintained... if you include everything you should have an LSM policy
> or something like that, and compiling kernels is expert thing.

I'm not talking about whitelist vs. blacklist, compiling kernels or
anything like that.

> Otherwise IMHO the kernel should provide default secure behaviour on
> how to load or add new functionality to the running one. From a user
> perspective, a switch "yes/no" that a privileged entity will
> *understand* and assume is what should be there, and the switch or
> flag as discussed here is local to processes, the sysctl will be
> removed. IMO it should come from userspace point of view, cause as an
> example the sysctl:
> 
> net.ipv4.tcp_available_congestion_control = cubic reno lp
> 
> Is kernel thing, too technical, userspace developers, admins or
> privileged entity will not understand what cubic or reno mean.

Congestion control algorithms are being used as an example in terms of
the mechanism being used to control which are available to unprivileged
users. The obscurity of congestion control algorithms is irrelevant.

>  Doing
> the same per functionality directly like this seems to much of a
> burden compared to the use case. The kernel maybe can do this to
> advance the art of the networking stack and for advanced cases, but in
> IMHO a sane default behaviour + an abstracted process/sandbox flag
> "yes/no" for most others, userspace developers and humans is what
> should be provided and we need the kernel to help here.

There are cases where unprivileged module auto-loading is relied upon
like network protocols. Having configuration for which protocols can be
used by unprivileged users is superior to limiting only which ones can
be auto-loaded. That's why I bought up the existing congestion control
knob. It works well in terms of having a whitelist of the sane, widely
used cases with exposing anything obscure requiring configuration. They
happen to be implemented as modules too.

Killing off unprivileged module loading other than a few cases like that
makes sense, and then those can provide similar control with similarly
sane defaults.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-12-01  6:39                                             ` Daniel Micay
  0 siblings, 0 replies; 266+ messages in thread
From: Daniel Micay @ 2017-12-01  6:39 UTC (permalink / raw)
  To: linux-security-module

It was suggested that the feature would only be adopted in niches like
Android and I pointed out that it's not really relevant to Android.

It's a waste of time to try convincing me that it's useful elsewhere. I
never said or implied that it wasn't.

On Thu, 2017-11-30 at 09:50 +0100, Djalal Harouni wrote:
> On Thu, Nov 30, 2017 at 7:51 AM, Daniel Micay <danielmicay@gmail.com>
> wrote:
> [...]
> > Lots of potential module attack surface also gets eliminated by
> > default
> > via their SELinux whitelists for /dev, /sys, /proc, debugfs, ioctl
> > commands, etc. The global seccomp whitelist might be relevant in
> > some
> > cases too.
> 
> In embedded systems we can't maintain a SELinux policy, distro man
> power hardly manage. We have abstracted seccomp etc, but the kernel
> inherited the difficult multiplex things, plus all other paths that
> trigger this.

It's cheaper to use an existing system like Android Things where device
makers only need to make their apps and perhaps some userspace hardware
drivers for cases not covered by mainline kernel drivers. I don't think
it makes sense for every device vendor to manage an OS and I seriously
doubt that's how the ecosystem is going to end up as it matures.

> > Android devices like to build everything into the kernel too, so
> > even if
> > they weren't using a module this feature wouldn't usually help them.
> > It
> > would need to work like this existing sysctl:
> > 
> >     net.ipv4.tcp_available_congestion_control = cubic reno lp
> > 
> > i.e. whitelists for functionality offered by the modules, not just
> > whether they can be loaded.
> 
> Yes, but it is hard to maintain a whitelist policy, the code is hardly
> maintained... if you include everything you should have an LSM policy
> or something like that, and compiling kernels is expert thing.

I'm not talking about whitelist vs. blacklist, compiling kernels or
anything like that.

> Otherwise IMHO the kernel should provide default secure behaviour on
> how to load or add new functionality to the running one. From a user
> perspective, a switch "yes/no" that a privileged entity will
> *understand* and assume is what should be there, and the switch or
> flag as discussed here is local to processes, the sysctl will be
> removed. IMO it should come from userspace point of view, cause as an
> example the sysctl:
> 
> net.ipv4.tcp_available_congestion_control = cubic reno lp
> 
> Is kernel thing, too technical, userspace developers, admins or
> privileged entity will not understand what cubic or reno mean.

Congestion control algorithms are being used as an example in terms of
the mechanism being used to control which are available to unprivileged
users. The obscurity of congestion control algorithms is irrelevant.

>  Doing
> the same per functionality directly like this seems to much of a
> burden compared to the use case. The kernel maybe can do this to
> advance the art of the networking stack and for advanced cases, but in
> IMHO a sane default behaviour + an abstracted process/sandbox flag
> "yes/no" for most others, userspace developers and humans is what
> should be provided and we need the kernel to help here.

There are cases where unprivileged module auto-loading is relied upon
like network protocols. Having configuration for which protocols can be
used by unprivileged users is superior to limiting only which ones can
be auto-loaded. That's why I bought up the existing congestion control
knob. It works well in terms of having a whitelist of the sane, widely
used cases with exposing anything obscure requiring configuration. They
happen to be implemented as modules too.

Killing off unprivileged module loading other than a few cases like that
makes sense, and then those can provide similar control with similarly
sane defaults.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-12-01  6:39                                             ` Daniel Micay
  0 siblings, 0 replies; 266+ messages in thread
From: Daniel Micay @ 2017-12-01  6:39 UTC (permalink / raw)
  To: Djalal Harouni
  Cc: Linus Torvalds, Kees Cook, Jessica Yu, LSM List,
	Linux Kernel Mailing List, kernel-hardening

It was suggested that the feature would only be adopted in niches like
Android and I pointed out that it's not really relevant to Android.

It's a waste of time to try convincing me that it's useful elsewhere. I
never said or implied that it wasn't.

On Thu, 2017-11-30 at 09:50 +0100, Djalal Harouni wrote:
> On Thu, Nov 30, 2017 at 7:51 AM, Daniel Micay <danielmicay@gmail.com>
> wrote:
> [...]
> > Lots of potential module attack surface also gets eliminated by
> > default
> > via their SELinux whitelists for /dev, /sys, /proc, debugfs, ioctl
> > commands, etc. The global seccomp whitelist might be relevant in
> > some
> > cases too.
> 
> In embedded systems we can't maintain a SELinux policy, distro man
> power hardly manage. We have abstracted seccomp etc, but the kernel
> inherited the difficult multiplex things, plus all other paths that
> trigger this.

It's cheaper to use an existing system like Android Things where device
makers only need to make their apps and perhaps some userspace hardware
drivers for cases not covered by mainline kernel drivers. I don't think
it makes sense for every device vendor to manage an OS and I seriously
doubt that's how the ecosystem is going to end up as it matures.

> > Android devices like to build everything into the kernel too, so
> > even if
> > they weren't using a module this feature wouldn't usually help them.
> > It
> > would need to work like this existing sysctl:
> > 
> >     net.ipv4.tcp_available_congestion_control = cubic reno lp
> > 
> > i.e. whitelists for functionality offered by the modules, not just
> > whether they can be loaded.
> 
> Yes, but it is hard to maintain a whitelist policy, the code is hardly
> maintained... if you include everything you should have an LSM policy
> or something like that, and compiling kernels is expert thing.

I'm not talking about whitelist vs. blacklist, compiling kernels or
anything like that.

> Otherwise IMHO the kernel should provide default secure behaviour on
> how to load or add new functionality to the running one. From a user
> perspective, a switch "yes/no" that a privileged entity will
> *understand* and assume is what should be there, and the switch or
> flag as discussed here is local to processes, the sysctl will be
> removed. IMO it should come from userspace point of view, cause as an
> example the sysctl:
> 
> net.ipv4.tcp_available_congestion_control = cubic reno lp
> 
> Is kernel thing, too technical, userspace developers, admins or
> privileged entity will not understand what cubic or reno mean.

Congestion control algorithms are being used as an example in terms of
the mechanism being used to control which are available to unprivileged
users. The obscurity of congestion control algorithms is irrelevant.

>  Doing
> the same per functionality directly like this seems to much of a
> burden compared to the use case. The kernel maybe can do this to
> advance the art of the networking stack and for advanced cases, but in
> IMHO a sane default behaviour + an abstracted process/sandbox flag
> "yes/no" for most others, userspace developers and humans is what
> should be provided and we need the kernel to help here.

There are cases where unprivileged module auto-loading is relied upon
like network protocols. Having configuration for which protocols can be
used by unprivileged users is superior to limiting only which ones can
be auto-loaded. That's why I bought up the existing congestion control
knob. It works well in terms of having a whitelist of the sane, widely
used cases with exposing anything obscure requiring configuration. They
happen to be implemented as modules too.

Killing off unprivileged module loading other than a few cases like that
makes sense, and then those can provide similar control with similarly
sane defaults.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
  2017-11-28 23:49                             ` Theodore Ts'o
  (?)
@ 2017-12-01 15:22                               ` Marcus Meissner
  -1 siblings, 0 replies; 266+ messages in thread
From: Marcus Meissner @ 2017-12-01 15:22 UTC (permalink / raw)
  To: Theodore Ts'o, Kees Cook, Linus Torvalds, Djalal Harouni,
	Jonathan Corbet, James Morris, LSM List,
	Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 06:49:20PM -0500, Theodore Ts'o wrote:
> On Tue, Nov 28, 2017 at 03:29:01PM -0800, Kees Cook wrote:
> > > So in these two cases, if the kernel was built w/o modules, and HDLC
> > > and DCCP was built-in, you'd be screwed, then?
> > 
> > Sure, but that's not the common situation.
> > 
> > > Is the goal here to protect people using distro kernels which build
> > > the world as modules, including dodgy pieces of kernel code that are
> > > bug-ridden?
> > 
> > The bulk of the risk comes from distro kernels, yes. (Though "bug
> > ridden" is a strong statement. There are and will be bugs, scattered
> > across a wide portion of the kernel, it's just that modules tend to
> > cover most of that attack surface.)
> 
> OK, but if the goal is to protect users who are running distro
> kernels, then a kernel config that breaks some percentage of users is
> ****highly**** unlikely to be enabled by Red Hat and SuSE, right?  And
> many of these users either can't (from a skills perspective) or won't
> (because they lose all support from the enterprise distro's help desk)
> recompile their kernel to enable an "might break 3% of all users ---
> oh, well" config option.

Yes, breaking customers is not seen lightly.

I also (not related to this thread here, more to SLAB hardening et.al)
have a hard time getting performance losses caused by hardening features approved.
 
> Which argues for being extremely conservative about making something
> that has an extremely low probability of breaking users, and it points
> out why Geo Kozey's "who cares about breaking users; security is
> IMPORTANT" argument is so wrong-headed.
> 
> If the goal is to protect distro kernels, but a sloppy approach
> guarantees that distro kernels will never enable it, is it really
> worth it?
> 
> 	   	       	       	    	  	 - Ted
> 
> P.S.  This is where it might be useful to get some input from the Red
> Hat and SuSE support teams.  How many angry user calls to their help
> desk are they willing to field before they'll just turn off the kernel
> config option for their kernels?

Speaking for SUSE ... If something that worked for people before and
it breaks, we do get feedback. If no one used it however, we won't.

For our last major product we went over the network module list and 
disabled some for building. e.g. DCCP is no longer built. We did
not receive any complaints about missing DCCP to my knowledge.

We also seperate our modules into "regular supported" and "unsupported"
in different RPMs. The "unsupported" module packages are not shipped on
the Server product. They were shipped on the desktop as some of the WiFi
drivers were requested by customers but were considered not supportable.

We do review this supportable list between kernel version jumps.

Ciao, Marcus

^ permalink raw reply	[flat|nested] 266+ messages in thread

* [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-12-01 15:22                               ` Marcus Meissner
  0 siblings, 0 replies; 266+ messages in thread
From: Marcus Meissner @ 2017-12-01 15:22 UTC (permalink / raw)
  To: linux-security-module

On Tue, Nov 28, 2017 at 06:49:20PM -0500, Theodore Ts'o wrote:
> On Tue, Nov 28, 2017 at 03:29:01PM -0800, Kees Cook wrote:
> > > So in these two cases, if the kernel was built w/o modules, and HDLC
> > > and DCCP was built-in, you'd be screwed, then?
> > 
> > Sure, but that's not the common situation.
> > 
> > > Is the goal here to protect people using distro kernels which build
> > > the world as modules, including dodgy pieces of kernel code that are
> > > bug-ridden?
> > 
> > The bulk of the risk comes from distro kernels, yes. (Though "bug
> > ridden" is a strong statement. There are and will be bugs, scattered
> > across a wide portion of the kernel, it's just that modules tend to
> > cover most of that attack surface.)
> 
> OK, but if the goal is to protect users who are running distro
> kernels, then a kernel config that breaks some percentage of users is
> ****highly**** unlikely to be enabled by Red Hat and SuSE, right?  And
> many of these users either can't (from a skills perspective) or won't
> (because they lose all support from the enterprise distro's help desk)
> recompile their kernel to enable an "might break 3% of all users ---
> oh, well" config option.

Yes, breaking customers is not seen lightly.

I also (not related to this thread here, more to SLAB hardening et.al)
have a hard time getting performance losses caused by hardening features approved.
 
> Which argues for being extremely conservative about making something
> that has an extremely low probability of breaking users, and it points
> out why Geo Kozey's "who cares about breaking users; security is
> IMPORTANT" argument is so wrong-headed.
> 
> If the goal is to protect distro kernels, but a sloppy approach
> guarantees that distro kernels will never enable it, is it really
> worth it?
> 
> 	   	       	       	    	  	 - Ted
> 
> P.S.  This is where it might be useful to get some input from the Red
> Hat and SuSE support teams.  How many angry user calls to their help
> desk are they willing to field before they'll just turn off the kernel
> config option for their kernels?

Speaking for SUSE ... If something that worked for people before and
it breaks, we do get feedback. If no one used it however, we won't.

For our last major product we went over the network module list and 
disabled some for building. e.g. DCCP is no longer built. We did
not receive any complaints about missing DCCP to my knowledge.

We also seperate our modules into "regular supported" and "unsupported"
in different RPMs. The "unsupported" module packages are not shipped on
the Server product. They were shipped on the desktop as some of the WiFi
drivers were requested by customers but were considered not supportable.

We do review this supportable list between kernel version jumps.

Ciao, Marcus
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules
@ 2017-12-01 15:22                               ` Marcus Meissner
  0 siblings, 0 replies; 266+ messages in thread
From: Marcus Meissner @ 2017-12-01 15:22 UTC (permalink / raw)
  To: Theodore Ts'o, Kees Cook, Linus Torvalds, Djalal Harouni,
	Jonathan Corbet, James Morris, LSM List,
	Linux Kernel Mailing List, kernel-hardening, Geo Kozey

On Tue, Nov 28, 2017 at 06:49:20PM -0500, Theodore Ts'o wrote:
> On Tue, Nov 28, 2017 at 03:29:01PM -0800, Kees Cook wrote:
> > > So in these two cases, if the kernel was built w/o modules, and HDLC
> > > and DCCP was built-in, you'd be screwed, then?
> > 
> > Sure, but that's not the common situation.
> > 
> > > Is the goal here to protect people using distro kernels which build
> > > the world as modules, including dodgy pieces of kernel code that are
> > > bug-ridden?
> > 
> > The bulk of the risk comes from distro kernels, yes. (Though "bug
> > ridden" is a strong statement. There are and will be bugs, scattered
> > across a wide portion of the kernel, it's just that modules tend to
> > cover most of that attack surface.)
> 
> OK, but if the goal is to protect users who are running distro
> kernels, then a kernel config that breaks some percentage of users is
> ****highly**** unlikely to be enabled by Red Hat and SuSE, right?  And
> many of these users either can't (from a skills perspective) or won't
> (because they lose all support from the enterprise distro's help desk)
> recompile their kernel to enable an "might break 3% of all users ---
> oh, well" config option.

Yes, breaking customers is not seen lightly.

I also (not related to this thread here, more to SLAB hardening et.al)
have a hard time getting performance losses caused by hardening features approved.
 
> Which argues for being extremely conservative about making something
> that has an extremely low probability of breaking users, and it points
> out why Geo Kozey's "who cares about breaking users; security is
> IMPORTANT" argument is so wrong-headed.
> 
> If the goal is to protect distro kernels, but a sloppy approach
> guarantees that distro kernels will never enable it, is it really
> worth it?
> 
> 	   	       	       	    	  	 - Ted
> 
> P.S.  This is where it might be useful to get some input from the Red
> Hat and SuSE support teams.  How many angry user calls to their help
> desk are they willing to field before they'll just turn off the kernel
> config option for their kernels?

Speaking for SUSE ... If something that worked for people before and
it breaks, we do get feedback. If no one used it however, we won't.

For our last major product we went over the network module list and 
disabled some for building. e.g. DCCP is no longer built. We did
not receive any complaints about missing DCCP to my knowledge.

We also seperate our modules into "regular supported" and "unsupported"
in different RPMs. The "unsupported" module packages are not shipped on
the Server product. They were shipped on the desktop as some of the WiFi
drivers were requested by customers but were considered not supportable.

We do review this supportable list between kernel version jumps.

Ciao, Marcus

^ permalink raw reply	[flat|nested] 266+ messages in thread

end of thread, other threads:[~2017-12-01 15:22 UTC | newest]

Thread overview: 266+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-27 17:18 [PATCH v5 next 0/5] Improve Module autoloading infrastructure Djalal Harouni
2017-11-27 17:18 ` [kernel-hardening] " Djalal Harouni
2017-11-27 17:18 ` Djalal Harouni
2017-11-27 17:18 ` [PATCH v5 next 1/5] modules:capabilities: add request_module_cap() Djalal Harouni
2017-11-27 17:18   ` [kernel-hardening] " Djalal Harouni
2017-11-27 17:18   ` Djalal Harouni
2017-11-27 18:48   ` Randy Dunlap
2017-11-27 18:48     ` [kernel-hardening] " Randy Dunlap
2017-11-27 18:48     ` Randy Dunlap
2017-11-27 21:35     ` Djalal Harouni
2017-11-27 21:35       ` [kernel-hardening] " Djalal Harouni
2017-11-27 21:35       ` Djalal Harouni
2017-11-28 19:14   ` Luis R. Rodriguez
2017-11-28 19:14     ` [kernel-hardening] " Luis R. Rodriguez
2017-11-28 19:14     ` Luis R. Rodriguez
2017-11-28 20:11     ` Kees Cook
2017-11-28 20:11       ` [kernel-hardening] " Kees Cook
2017-11-28 20:11       ` Kees Cook
2017-11-28 21:16       ` Luis R. Rodriguez
2017-11-28 21:16         ` [kernel-hardening] " Luis R. Rodriguez
2017-11-28 21:16         ` Luis R. Rodriguez
2017-11-28 21:33         ` Djalal Harouni
2017-11-28 21:33           ` [kernel-hardening] " Djalal Harouni
2017-11-28 21:33           ` Djalal Harouni
2017-11-28 22:18           ` Luis R. Rodriguez
2017-11-28 22:18             ` [kernel-hardening] " Luis R. Rodriguez
2017-11-28 22:18             ` Luis R. Rodriguez
2017-11-28 22:52             ` Djalal Harouni
2017-11-28 22:52               ` [kernel-hardening] " Djalal Harouni
2017-11-28 22:52               ` Djalal Harouni
2017-11-28 21:39         ` Kees Cook
2017-11-28 21:39           ` [kernel-hardening] " Kees Cook
2017-11-28 21:39           ` Kees Cook
2017-11-28 22:12           ` Luis R. Rodriguez
2017-11-28 22:12             ` [kernel-hardening] " Luis R. Rodriguez
2017-11-28 22:12             ` Luis R. Rodriguez
2017-11-28 22:18             ` Kees Cook
2017-11-28 22:18               ` [kernel-hardening] " Kees Cook
2017-11-28 22:18               ` Kees Cook
2017-11-28 22:48               ` Luis R. Rodriguez
2017-11-28 22:48                 ` [kernel-hardening] " Luis R. Rodriguez
2017-11-28 22:48                 ` Luis R. Rodriguez
2017-11-29  7:49                 ` Michal Kubecek
2017-11-29  7:49                   ` [kernel-hardening] " Michal Kubecek
2017-11-29  7:49                   ` Michal Kubecek
2017-11-29 13:46           ` Alan Cox
2017-11-29 13:46             ` [kernel-hardening] " Alan Cox
2017-11-29 13:46             ` Alan Cox
2017-11-29 14:50             ` David Miller
2017-11-29 14:50               ` [kernel-hardening] " David Miller
2017-11-29 14:50               ` David Miller
2017-11-29 15:54               ` Theodore Ts'o
2017-11-29 15:54                 ` [kernel-hardening] " Theodore Ts'o
2017-11-29 15:54                 ` Theodore Ts'o
2017-11-29 15:58                 ` David Miller
2017-11-29 15:58                   ` [kernel-hardening] " David Miller
2017-11-29 15:58                   ` David Miller
2017-11-29 16:29                   ` Theodore Ts'o
2017-11-29 16:29                     ` [kernel-hardening] " Theodore Ts'o
2017-11-29 16:29                     ` Theodore Ts'o
2017-11-29 22:45                   ` Linus Torvalds
2017-11-29 22:45                     ` [kernel-hardening] " Linus Torvalds
2017-11-29 22:45                     ` Linus Torvalds
2017-11-29 22:45                     ` Linus Torvalds
2017-11-30  0:06                     ` Kees Cook
2017-11-30  0:06                       ` [kernel-hardening] " Kees Cook
2017-11-30  0:06                       ` Kees Cook
2017-11-30  0:06                       ` Kees Cook
2017-11-29 17:28                 ` Serge E. Hallyn
2017-11-29 17:28                   ` [kernel-hardening] " Serge E. Hallyn
2017-11-29 17:28                   ` Serge E. Hallyn
2017-11-30  0:35                   ` Theodore Ts'o
2017-11-30  0:35                     ` [kernel-hardening] " Theodore Ts'o
2017-11-30  0:35                     ` Theodore Ts'o
2017-11-30 17:17                     ` Serge E. Hallyn
2017-11-30 17:17                       ` [kernel-hardening] " Serge E. Hallyn
2017-11-30 17:17                       ` Serge E. Hallyn
2017-11-28 20:18     ` Djalal Harouni
2017-11-28 20:18       ` [kernel-hardening] " Djalal Harouni
2017-11-28 20:18       ` Djalal Harouni
2017-11-27 17:18 ` [PATCH v5 next 2/5] modules:capabilities: add cap_kernel_module_request() permission check Djalal Harouni
2017-11-27 17:18   ` [kernel-hardening] " Djalal Harouni
2017-11-27 17:18   ` Djalal Harouni
2017-11-30  2:05   ` Luis R. Rodriguez
2017-11-30  2:05     ` [kernel-hardening] " Luis R. Rodriguez
2017-11-30  2:05     ` Luis R. Rodriguez
2017-11-27 17:18 ` [PATCH v5 next 3/5] modules:capabilities: automatic module loading restriction Djalal Harouni
2017-11-27 17:18   ` [kernel-hardening] " Djalal Harouni
2017-11-27 17:18   ` Djalal Harouni
2017-11-30  1:23   ` Luis R. Rodriguez
2017-11-30  1:23     ` [kernel-hardening] " Luis R. Rodriguez
2017-11-30  1:23     ` Luis R. Rodriguez
2017-11-30 12:22     ` Djalal Harouni
2017-11-30 12:22       ` [kernel-hardening] " Djalal Harouni
2017-11-30 12:22       ` Djalal Harouni
2017-11-27 17:18 ` [PATCH v5 next 4/5] modules:capabilities: add a per-task modules auto-load mode Djalal Harouni
2017-11-27 17:18   ` [kernel-hardening] " Djalal Harouni
2017-11-27 17:18   ` Djalal Harouni
2017-11-27 17:18 ` [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules Djalal Harouni
2017-11-27 17:18   ` [kernel-hardening] " Djalal Harouni
2017-11-27 17:18   ` Djalal Harouni
2017-11-27 18:44   ` Linus Torvalds
2017-11-27 18:44     ` [kernel-hardening] " Linus Torvalds
2017-11-27 18:44     ` Linus Torvalds
2017-11-27 18:44     ` Linus Torvalds
2017-11-27 21:41     ` Djalal Harouni
2017-11-27 21:41       ` [kernel-hardening] " Djalal Harouni
2017-11-27 21:41       ` Djalal Harouni
2017-11-27 21:41       ` Djalal Harouni
2017-11-27 22:04       ` Linus Torvalds
2017-11-27 22:04         ` [kernel-hardening] " Linus Torvalds
2017-11-27 22:04         ` Linus Torvalds
2017-11-27 22:04         ` Linus Torvalds
2017-11-27 22:59         ` Kees Cook
2017-11-27 22:59           ` [kernel-hardening] " Kees Cook
2017-11-27 22:59           ` Kees Cook
2017-11-27 22:59           ` Kees Cook
2017-11-27 23:14           ` Linus Torvalds
2017-11-27 23:14             ` [kernel-hardening] " Linus Torvalds
2017-11-27 23:14             ` Linus Torvalds
2017-11-27 23:14             ` Linus Torvalds
2017-11-27 23:19             ` Kees Cook
2017-11-27 23:19               ` [kernel-hardening] " Kees Cook
2017-11-27 23:19               ` Kees Cook
2017-11-27 23:19               ` Kees Cook
2017-11-27 23:35               ` Linus Torvalds
2017-11-27 23:35                 ` [kernel-hardening] " Linus Torvalds
2017-11-27 23:35                 ` Linus Torvalds
2017-11-27 23:35                 ` Linus Torvalds
2017-11-28  1:23             ` Kees Cook
2017-11-28  1:23               ` [kernel-hardening] " Kees Cook
2017-11-28  1:23               ` Kees Cook
2017-11-28  1:23               ` Kees Cook
2017-11-28 12:16         ` [kernel-hardening] " Geo Kozey
2017-11-28 12:16           ` Geo Kozey
2017-11-28 12:16           ` Geo Kozey
2017-11-28 19:32           ` Theodore Ts'o
2017-11-28 19:32             ` Theodore Ts'o
2017-11-28 19:32             ` Theodore Ts'o
2017-11-28 20:08             ` Kees Cook
2017-11-28 20:08               ` Kees Cook
2017-11-28 20:08               ` Kees Cook
2017-11-28 20:12               ` Linus Torvalds
2017-11-28 20:12                 ` Linus Torvalds
2017-11-28 20:12                 ` Linus Torvalds
2017-11-28 20:20                 ` Kees Cook
2017-11-28 20:20                   ` Kees Cook
2017-11-28 20:20                   ` Kees Cook
2017-11-28 20:33                   ` Linus Torvalds
2017-11-28 20:33                     ` Linus Torvalds
2017-11-28 20:33                     ` Linus Torvalds
2017-11-28 21:10                     ` Djalal Harouni
2017-11-28 21:10                       ` Djalal Harouni
2017-11-28 21:10                       ` Djalal Harouni
2017-11-28 21:33                     ` Kees Cook
2017-11-28 21:33                       ` Kees Cook
2017-11-28 21:33                       ` Kees Cook
2017-11-28 23:23                       ` Theodore Ts'o
2017-11-28 23:23                         ` Theodore Ts'o
2017-11-28 23:23                         ` Theodore Ts'o
2017-11-28 23:29                         ` Kees Cook
2017-11-28 23:29                           ` Kees Cook
2017-11-28 23:29                           ` Kees Cook
2017-11-28 23:49                           ` Theodore Ts'o
2017-11-28 23:49                             ` Theodore Ts'o
2017-11-28 23:49                             ` Theodore Ts'o
2017-11-29  0:18                             ` Kees Cook
2017-11-29  0:18                               ` Kees Cook
2017-11-29  0:18                               ` Kees Cook
2017-11-29  6:36                               ` Theodore Ts'o
2017-11-29  6:36                                 ` Theodore Ts'o
2017-11-29  6:36                                 ` Theodore Ts'o
2017-11-29 14:46                             ` Geo Kozey
2017-11-29 14:46                               ` Geo Kozey
2017-11-29 14:46                               ` Geo Kozey
2017-12-01 15:22                             ` Marcus Meissner
2017-12-01 15:22                               ` Marcus Meissner
2017-12-01 15:22                               ` Marcus Meissner
2017-11-28 23:53                         ` Djalal Harouni
2017-11-28 23:53                           ` Djalal Harouni
2017-11-28 23:53                           ` Djalal Harouni
2017-11-28 21:51                     ` Geo Kozey
2017-11-28 21:51                       ` Geo Kozey
2017-11-28 21:51                       ` Geo Kozey
2017-11-28 23:51                       ` Linus Torvalds
2017-11-28 23:51                         ` Linus Torvalds
2017-11-28 23:51                         ` Linus Torvalds
2017-11-29  0:17                         ` Linus Torvalds
2017-11-29  0:17                           ` Linus Torvalds
2017-11-29  0:17                           ` Linus Torvalds
2017-11-29  0:26                           ` Kees Cook
2017-11-29  0:26                             ` Kees Cook
2017-11-29  0:26                             ` Kees Cook
2017-11-29  0:50                             ` Linus Torvalds
2017-11-29  0:50                               ` Linus Torvalds
2017-11-29  0:50                               ` Linus Torvalds
2017-11-29  4:26                               ` Eric W. Biederman
2017-11-29  4:26                                 ` Eric W. Biederman
2017-11-29  4:26                                 ` Eric W. Biederman
2017-11-29 18:30                               ` Kees Cook
2017-11-29 18:30                                 ` Kees Cook
2017-11-29 18:30                                 ` Kees Cook
2017-11-29 18:46                                 ` Linus Torvalds
2017-11-29 18:46                                   ` Linus Torvalds
2017-11-29 18:46                                   ` Linus Torvalds
2017-11-29 18:53                                   ` Linus Torvalds
2017-11-29 18:53                                     ` Linus Torvalds
2017-11-29 18:53                                     ` Linus Torvalds
2017-11-29 21:17                                   ` Kees Cook
2017-11-29 21:17                                     ` Kees Cook
2017-11-29 21:17                                     ` Kees Cook
2017-11-29 22:14                                     ` Linus Torvalds
2017-11-29 22:14                                       ` Linus Torvalds
2017-11-29 22:14                                       ` Linus Torvalds
2017-11-30  0:44                                       ` Kees Cook
2017-11-30  0:44                                         ` Kees Cook
2017-11-30  0:44                                         ` Kees Cook
2017-11-30  2:08                                         ` Linus Torvalds
2017-11-30  2:08                                           ` Linus Torvalds
2017-11-30  2:08                                           ` Linus Torvalds
2017-11-30  6:51                                       ` Daniel Micay
2017-11-30  6:51                                         ` Daniel Micay
2017-11-30  6:51                                         ` Daniel Micay
2017-11-30  8:50                                         ` Djalal Harouni
2017-11-30  8:50                                           ` Djalal Harouni
2017-11-30  8:50                                           ` Djalal Harouni
2017-11-30 14:16                                           ` Theodore Ts'o
2017-11-30 14:16                                             ` Theodore Ts'o
2017-11-30 14:16                                             ` Theodore Ts'o
2017-11-30 14:51                                             ` Djalal Harouni
2017-11-30 14:51                                               ` Djalal Harouni
2017-11-30 14:51                                               ` Djalal Harouni
2017-12-01  6:39                                           ` Daniel Micay
2017-12-01  6:39                                             ` Daniel Micay
2017-12-01  6:39                                             ` Daniel Micay
2017-11-29 15:28                           ` Geo Kozey
2017-11-29 15:28                             ` Geo Kozey
2017-11-29 15:28                             ` Geo Kozey
2017-11-27 18:41 ` [PATCH v5 next 0/5] Improve Module autoloading infrastructure Linus Torvalds
2017-11-27 18:41   ` [kernel-hardening] " Linus Torvalds
2017-11-27 18:41   ` Linus Torvalds
2017-11-27 18:41   ` Linus Torvalds
2017-11-27 19:02   ` Linus Torvalds
2017-11-27 19:02     ` [kernel-hardening] " Linus Torvalds
2017-11-27 19:02     ` Linus Torvalds
2017-11-27 19:02     ` Linus Torvalds
2017-11-27 19:12     ` Linus Torvalds
2017-11-27 19:12       ` [kernel-hardening] " Linus Torvalds
2017-11-27 19:12       ` Linus Torvalds
2017-11-27 19:12       ` Linus Torvalds
2017-11-27 21:31       ` Djalal Harouni
2017-11-27 21:31         ` [kernel-hardening] " Djalal Harouni
2017-11-27 21:31         ` Djalal Harouni
2017-11-27 21:31         ` Djalal Harouni
2017-11-27 19:14   ` David Miller
2017-11-27 19:14     ` [kernel-hardening] " David Miller
2017-11-27 19:14     ` David Miller
2017-11-27 22:31     ` James Morris
2017-11-27 22:31       ` [kernel-hardening] " James Morris
2017-11-27 22:31       ` James Morris
2017-11-27 23:04       ` Kees Cook
2017-11-27 23:04         ` [kernel-hardening] " Kees Cook
2017-11-27 23:04         ` Kees Cook
2017-11-27 23:44         ` James Morris
2017-11-27 23:44           ` [kernel-hardening] " James Morris
2017-11-27 23:44           ` James Morris

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.