linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v3 0/2] Add capabilities file to sysfs
@ 2022-01-20 18:01 Francis Laniel
  2022-01-20 18:01 ` [RFC PATCH v3 1/2] capability: Add cap_string Francis Laniel
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Francis Laniel @ 2022-01-20 18:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-security-module, Serge Hallyn, Casey Schaufler, Francis Laniel

Hi.


First, I hope you are fine and the same for your relatives.

Capabilities are used to check if a thread has the right to perform a given
action [1].
For example, a thread with CAP_BPF set can use the bpf() syscall.

Capabilities are used in the container world.
In terms of code, several projects related to container maintain code where the
capabilities are written alike include/uapi/linux/capability.h [2][3][4][5].
For these projects, their codebase should be updated when a new capability is
added to the kernel.
Some other projects rely on <sys/capability.h> [6].
In this case, this header file should reflect the capabilities offered by the
kernel.

So, in this series, I added a new file to sysfs:
/sys/kernel/security/capabilities.
The goal of this file is to be used by "container world" software to know kernel
capabilities at run time instead of compile time.

The "file" is read-only and its content is the capability number associated with
the capability name:
root@vm-amd64:~# cat /sys/kernel/security/capabilities
0       CAP_CHOWN
1       CAP_DAC_OVERRIDE
...
40      CAP_CHECKPOINT_RESTORE

The kernel already exposes the last capability number under:
/proc/sys/kernel/cap_last_cap
So, I think there should not be any issue exposing all the capabilities it
offers.
If there is any, please share it as I do not want to introduce issue with this
series.

Also, if you see any way to improve this series please share it as it would
increase this contribution quality.

Change since v2:
* Use a char * for cap_string instead of an array, each line of this char *
contains the capability number and its name.
* Move the file under /sys/kernel/security instead of /sys/kernel.

Francis Laniel (2):
  capability: Add cap_string.
  security/inode.c: Add capabilities file.

 include/uapi/linux/capability.h |  1 +
 kernel/capability.c             | 45 +++++++++++++++++++++++++++++++++
 security/inode.c                | 16 ++++++++++++
 3 files changed, 62 insertions(+)


Best regards and thank you in advance for your reviews.
---
[1] man capabilities
[2] https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a5664fab21d6f7d1e/pkg/cap/cap_linux.go#L135
[3] https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902aabc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c1
moby relies on containerd code.
[4] https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0b7f94e3a0ffb/capability/enum.go#L47
[5] https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880ba0e6380519a/libcontainer/capabilities/capabilities.go#L12
runc relies on syndtr package.
[6] https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b880c089f1/src/libcrun/linux.c#L35
-- 
2.30.2


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC PATCH v3 1/2] capability: Add cap_string.
  2022-01-20 18:01 [RFC PATCH v3 0/2] Add capabilities file to sysfs Francis Laniel
@ 2022-01-20 18:01 ` Francis Laniel
  2022-01-20 18:01 ` [RFC PATCH v3 2/2] security/inode.c: Add capabilities file Francis Laniel
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Francis Laniel @ 2022-01-20 18:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-security-module, Serge Hallyn, Casey Schaufler, Francis Laniel

This string contains on each line the number of the capability associated to
its name.
For example, first line is:
__stringify(CAP_CHOWN) "\tCAP_CHOWN\n"
which the preprocessor will replace by:
"0\tCAP_CHOWN\n"

Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
---
 include/uapi/linux/capability.h |  1 +
 kernel/capability.c             | 45 +++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 463d1ba2232a..115f4fef00da 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -428,5 +428,6 @@ struct vfs_ns_cap_data {
 #define CAP_TO_INDEX(x)     ((x) >> 5)        /* 1 << 5 == bits in __u32 */
 #define CAP_TO_MASK(x)      (1 << ((x) & 31)) /* mask for indexed __u32 */
 
+extern const char *cap_string;
 
 #endif /* _UAPI_LINUX_CAPABILITY_H */
diff --git a/kernel/capability.c b/kernel/capability.c
index 46a361dde042..72b0aece4f81 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -15,6 +15,7 @@
 #include <linux/mm.h>
 #include <linux/export.h>
 #include <linux/security.h>
+#include <linux/stringify.h>
 #include <linux/syscalls.h>
 #include <linux/pid_namespace.h>
 #include <linux/user_namespace.h>
@@ -27,6 +28,50 @@
 const kernel_cap_t __cap_empty_set = CAP_EMPTY_SET;
 EXPORT_SYMBOL(__cap_empty_set);
 
+const char *cap_string =
+	__stringify(CAP_CHOWN) "\tCAP_CHOWN\n"
+	__stringify(CAP_DAC_OVERRIDE) "\tCAP_DAC_OVERRIDE\n"
+	__stringify(CAP_DAC_READ_SEARCH) "\tCAP_DAC_READ_SEARCH\n"
+	__stringify(CAP_FOWNER) "\tCAP_FOWNER\n"
+	__stringify(CAP_FSETID) "\tCAP_FSETID\n"
+	__stringify(CAP_KILL) "\tCAP_KILL\n"
+	__stringify(CAP_SETGID) "\tCAP_SETGID\n"
+	__stringify(CAP_SETUID) "\tCAP_SETUID\n"
+	__stringify(CAP_SETPCAP) "\tCAP_SETPCAP\n"
+	__stringify(CAP_LINUX_IMMUTABLE) "\tCAP_LINUX_IMMUTABLE\n"
+	__stringify(CAP_NET_BIND_SERVICE) "\tCAP_NET_BIND_SERVICE\n"
+	__stringify(CAP_NET_BROADCAST) "\tCAP_NET_BROADCAST\n"
+	__stringify(CAP_NET_ADMIN) "\tCAP_NET_ADMIN\n"
+	__stringify(CAP_NET_RAW) "\tCAP_NET_RAW\n"
+	__stringify(CAP_IPC_LOCK) "\tCAP_IPC_LOCK\n"
+	__stringify(CAP_IPC_OWNER) "\tCAP_IPC_OWNER\n"
+	__stringify(CAP_SYS_MODULE) "\tCAP_SYS_MODULE\n"
+	__stringify(CAP_SYS_RAWIO) "\tCAP_SYS_RAWIO\n"
+	__stringify(CAP_SYS_CHROOT) "\tCAP_SYS_CHROOT\n"
+	__stringify(CAP_SYS_PTRACE) "\tCAP_SYS_PTRACE\n"
+	__stringify(CAP_SYS_PACCT) "\tCAP_SYS_PACCT\n"
+	__stringify(CAP_SYS_ADMIN) "\tCAP_SYS_ADMIN\n"
+	__stringify(CAP_SYS_BOOT) "\tCAP_SYS_BOOT\n"
+	__stringify(CAP_SYS_NICE) "\tCAP_SYS_NICE\n"
+	__stringify(CAP_SYS_RESOURCE) "\tCAP_SYS_RESOURCE\n"
+	__stringify(CAP_SYS_TIME) "\tCAP_SYS_TIME\n"
+	__stringify(CAP_SYS_TTY_CONFIG) "\tCAP_SYS_TTY_CONFIG\n"
+	__stringify(CAP_MKNOD) "\tCAP_MKNOD\n"
+	__stringify(CAP_LEASE) "\tCAP_LEASE\n"
+	__stringify(CAP_AUDIT_WRITE) "\tCAP_AUDIT_WRITE\n"
+	__stringify(CAP_AUDIT_CONTROL) "\tCAP_AUDIT_CONTROL\n"
+	__stringify(CAP_SETFCAP) "\tCAP_SETFCAP\n"
+	__stringify(CAP_MAC_OVERRIDE) "\tCAP_MAC_OVERRIDE\n"
+	__stringify(CAP_MAC_ADMIN) "\tCAP_MAC_ADMIN\n"
+	__stringify(CAP_SYSLOG) "\tCAP_SYSLOG\n"
+	__stringify(CAP_WAKE_ALARM) "\tCAP_WAKE_ALARM\n"
+	__stringify(CAP_BLOCK_SUSPEND) "\tCAP_BLOCK_SUSPEND\n"
+	__stringify(CAP_AUDIT_READ) "\tCAP_AUDIT_READ\n"
+	__stringify(CAP_PERFMON) "\tCAP_PERFMON\n"
+	__stringify(CAP_BPF) "\tCAP_BPF\n"
+	__stringify(CAP_CHECKPOINT_RESTORE) "\tCAP_CHECKPOINT_RESTORE\n"
+;
+
 int file_caps_enabled = 1;
 
 static int __init file_caps_disable(char *str)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC PATCH v3 2/2] security/inode.c: Add capabilities file.
  2022-01-20 18:01 [RFC PATCH v3 0/2] Add capabilities file to sysfs Francis Laniel
  2022-01-20 18:01 ` [RFC PATCH v3 1/2] capability: Add cap_string Francis Laniel
@ 2022-01-20 18:01 ` Francis Laniel
  2022-01-21  8:58   ` Francis Laniel
  2022-01-20 18:09 ` [RFC PATCH v3 0/2] Add capabilities file to sysfs Casey Schaufler
  2022-01-20 18:20 ` Eric W. Biederman
  3 siblings, 1 reply; 8+ messages in thread
From: Francis Laniel @ 2022-01-20 18:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-security-module, Serge Hallyn, Casey Schaufler, Francis Laniel

This new read-only file prints the capabilities values with their names:
cat /sys/kernel/security/capabilities
0       CAP_CHOWN
1       CAP_DAC_OVERRIDE
...
40      CAP_CHECKPOINT_RESTORE

Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
---
 security/inode.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/security/inode.c b/security/inode.c
index 6c326939750d..cef78b497bab 100644
--- a/security/inode.c
+++ b/security/inode.c
@@ -21,6 +21,7 @@
 #include <linux/security.h>
 #include <linux/lsm_hooks.h>
 #include <linux/magic.h>
+#include <linux/capability.h>
 
 static struct vfsmount *mount;
 static int mount_count;
@@ -328,6 +329,19 @@ static const struct file_operations lsm_ops = {
 };
 #endif
 
+static struct dentry *capabilities_dentry;
+static ssize_t capabilities_read(struct file *unused, char __user *buf,
+				 size_t count, loff_t *ppos)
+{
+	return simple_read_from_buffer(buf, count, ppos, cap_string,
+				       strlen(cap_string));
+}
+
+static const struct file_operations capabilities_ops = {
+	.read = capabilities_read,
+	.llseek = generic_file_llseek,
+};
+
 static int __init securityfs_init(void)
 {
 	int retval;
@@ -345,6 +359,8 @@ static int __init securityfs_init(void)
 	lsm_dentry = securityfs_create_file("lsm", 0444, NULL, NULL,
 						&lsm_ops);
 #endif
+	capabilities_dentry = securityfs("capabilities", 0444, NULL, NULL,
+					 capabilities_ops);
 	return 0;
 }
 core_initcall(securityfs_init);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH v3 0/2] Add capabilities file to sysfs
  2022-01-20 18:01 [RFC PATCH v3 0/2] Add capabilities file to sysfs Francis Laniel
  2022-01-20 18:01 ` [RFC PATCH v3 1/2] capability: Add cap_string Francis Laniel
  2022-01-20 18:01 ` [RFC PATCH v3 2/2] security/inode.c: Add capabilities file Francis Laniel
@ 2022-01-20 18:09 ` Casey Schaufler
  2022-01-20 18:14   ` Francis Laniel
  2022-01-20 18:20 ` Eric W. Biederman
  3 siblings, 1 reply; 8+ messages in thread
From: Casey Schaufler @ 2022-01-20 18:09 UTC (permalink / raw)
  To: Francis Laniel, linux-kernel
  Cc: linux-security-module, Serge Hallyn, Casey Schaufler

On 1/20/2022 10:01 AM, Francis Laniel wrote:
> Hi.
>
>
> First, I hope you are fine and the same for your relatives.
>
> Capabilities are used to check if a thread has the right to perform a given
> action [1].
> For example, a thread with CAP_BPF set can use the bpf() syscall.
>
> Capabilities are used in the container world.
> In terms of code, several projects related to container maintain code where the
> capabilities are written alike include/uapi/linux/capability.h [2][3][4][5].
> For these projects, their codebase should be updated when a new capability is
> added to the kernel.
> Some other projects rely on <sys/capability.h> [6].
> In this case, this header file should reflect the capabilities offered by the
> kernel.
>
> So, in this series, I added a new file to sysfs:
> /sys/kernel/security/capabilities.
> The goal of this file is to be used by "container world" software to know kernel
> capabilities at run time instead of compile time.
>
> The "file" is read-only and its content is the capability number associated with
> the capability name:
> root@vm-amd64:~# cat /sys/kernel/security/capabilities
> 0       CAP_CHOWN
> 1       CAP_DAC_OVERRIDE
> ...
> 40      CAP_CHECKPOINT_RESTORE
>
> The kernel already exposes the last capability number under:
> /proc/sys/kernel/cap_last_cap
> So, I think there should not be any issue exposing all the capabilities it
> offers.
> If there is any, please share it as I do not want to introduce issue with this
> series.
>
> Also, if you see any way to improve this series please share it as it would
> increase this contribution quality.
>
> Change since v2:
> * Use a char * for cap_string instead of an array, each line of this char *
> contains the capability number and its name.
> * Move the file under /sys/kernel/security instead of /sys/kernel.
>
> Francis Laniel (2):
>    capability: Add cap_string.
>    security/inode.c: Add capabilities file.
>
>   include/uapi/linux/capability.h |  1 +
>   kernel/capability.c             | 45 +++++++++++++++++++++++++++++++++
>   security/inode.c                | 16 ++++++++++++
>   3 files changed, 62 insertions(+)

For the series:

Acked-by: Casey Schaufler <casey@schaufler-ca.com>

>
>
> Best regards and thank you in advance for your reviews.
> ---
> [1] man capabilities
> [2] https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a5664fab21d6f7d1e/pkg/cap/cap_linux.go#L135
> [3] https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902aabc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c1
> moby relies on containerd code.
> [4] https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0b7f94e3a0ffb/capability/enum.go#L47
> [5] https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880ba0e6380519a/libcontainer/capabilities/capabilities.go#L12
> runc relies on syndtr package.
> [6] https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b880c089f1/src/libcrun/linux.c#L35

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH v3 0/2] Add capabilities file to sysfs
  2022-01-20 18:09 ` [RFC PATCH v3 0/2] Add capabilities file to sysfs Casey Schaufler
@ 2022-01-20 18:14   ` Francis Laniel
  0 siblings, 0 replies; 8+ messages in thread
From: Francis Laniel @ 2022-01-20 18:14 UTC (permalink / raw)
  To: linux-kernel, Casey Schaufler; +Cc: linux-security-module, Serge Hallyn

Le jeudi 20 janvier 2022, 19:09:50 CET Casey Schaufler a écrit :
> On 1/20/2022 10:01 AM, Francis Laniel wrote:
> > Hi.
> > 
> > 
> > First, I hope you are fine and the same for your relatives.
> > 
> > Capabilities are used to check if a thread has the right to perform a
> > given
> > action [1].
> > For example, a thread with CAP_BPF set can use the bpf() syscall.
> > 
> > Capabilities are used in the container world.
> > In terms of code, several projects related to container maintain code
> > where the capabilities are written alike include/uapi/linux/capability.h
> > [2][3][4][5]. For these projects, their codebase should be updated when a
> > new capability is added to the kernel.
> > Some other projects rely on <sys/capability.h> [6].
> > In this case, this header file should reflect the capabilities offered by
> > the kernel.
> > 
> > So, in this series, I added a new file to sysfs:
> > /sys/kernel/security/capabilities.
> > The goal of this file is to be used by "container world" software to know
> > kernel capabilities at run time instead of compile time.
> > 
> > The "file" is read-only and its content is the capability number
> > associated with the capability name:
> > root@vm-amd64:~# cat /sys/kernel/security/capabilities
> > 0       CAP_CHOWN
> > 1       CAP_DAC_OVERRIDE
> > ...
> > 40      CAP_CHECKPOINT_RESTORE
> > 
> > The kernel already exposes the last capability number under:
> > /proc/sys/kernel/cap_last_cap
> > So, I think there should not be any issue exposing all the capabilities it
> > offers.
> > If there is any, please share it as I do not want to introduce issue with
> > this series.
> > 
> > Also, if you see any way to improve this series please share it as it
> > would
> > increase this contribution quality.
> > 
> > Change since v2:
> > * Use a char * for cap_string instead of an array, each line of this char
> > *
> > contains the capability number and its name.
> > * Move the file under /sys/kernel/security instead of /sys/kernel.
> > 
> > Francis Laniel (2):
> >    capability: Add cap_string.
> >    security/inode.c: Add capabilities file.
> >   
> >   include/uapi/linux/capability.h |  1 +
> >   kernel/capability.c             | 45 +++++++++++++++++++++++++++++++++
> >   security/inode.c                | 16 ++++++++++++
> >   3 files changed, 62 insertions(+)
> 
> For the series:
> 
> Acked-by: Casey Schaufler <casey@schaufler-ca.com>

Thank you!
I will wait to get some comments on the v3 and send a v4 with your tag added!

> > Best regards and thank you in advance for your reviews.
> > ---
> > [1] man capabilities
> > [2]
> > https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a566
> > 4fab21d6f7d1e/pkg/cap/cap_linux.go#L135 [3]
> > https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902a
> > abc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c1
> > moby relies on containerd code.
> > [4]
> > https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0b
> > 7f94e3a0ffb/capability/enum.go#L47 [5]
> > https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880b
> > a0e6380519a/libcontainer/capabilities/capabilities.go#L12 runc relies on
> > syndtr package.
> > [6]
> > https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b88
> > 0c089f1/src/libcrun/linux.c#L35





^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH v3 0/2] Add capabilities file to sysfs
  2022-01-20 18:01 [RFC PATCH v3 0/2] Add capabilities file to sysfs Francis Laniel
                   ` (2 preceding siblings ...)
  2022-01-20 18:09 ` [RFC PATCH v3 0/2] Add capabilities file to sysfs Casey Schaufler
@ 2022-01-20 18:20 ` Eric W. Biederman
  2022-01-20 19:01   ` Francis Laniel
  3 siblings, 1 reply; 8+ messages in thread
From: Eric W. Biederman @ 2022-01-20 18:20 UTC (permalink / raw)
  To: Francis Laniel
  Cc: linux-kernel, linux-security-module, Serge Hallyn, Casey Schaufler

Francis Laniel <flaniel@linux.microsoft.com> writes:

> Hi.
>
>
> First, I hope you are fine and the same for your relatives.
>
>
> Capabilities are used to check if a thread has the right to perform a given
> action [1].
> For example, a thread with CAP_BPF set can use the bpf() syscall.
>
> Capabilities are used in the container world.
> In terms of code, several projects related to container maintain code where the
> capabilities are written alike include/uapi/linux/capability.h [2][3][4][5].
> For these projects, their codebase should be updated when a new capability is
> added to the kernel.
> Some other projects rely on <sys/capability.h> [6].
> In this case, this header file should reflect the capabilities offered by the
> kernel.
>
> So, in this series, I added a new file to sysfs:
> /sys/kernel/security/capabilities.

Actually that is a file in securityfs.  Which is related but slightly
different.  For sysfs this would be immediately unacceptable as it
breaks the one value per file rule.  I don't know what the rules
are for securityfs but I do know files that contain many many lines
and get very large tend to be problematic in both their kernel
implementation and in userspace parsing speed.

So I am looking for what the advantage of this file that justifies the
cost of maintaining it.

> The goal of this file is to be used by "container world" software to know kernel
> capabilities at run time instead of compile time.

I don't understand the problem you are trying to solve.  If the software
needs to updated what benefit is there for all of the information to be
available at runtime?

>
> The "file" is read-only and its content is the capability number associated with
> the capability name:
> root@vm-amd64:~# cat /sys/kernel/security/capabilities
> 0       CAP_CHOWN
> 1       CAP_DAC_OVERRIDE
> ...
> 40      CAP_CHECKPOINT_RESTORE
>

> The kernel already exposes the last capability number under:
> /proc/sys/kernel/cap_last_cap
> So, I think there should not be any issue exposing all the capabilities it
> offers.
> If there is any, please share it as I do not want to introduce issue with this
> series.

The mapping between capabilities and numbers should never change it is
ABI.  I seem to remember a version number in the file capability so that
if the mappings do change that number can be changed in a way that
existing software is not confused.

What is the advantage in printing all of the mappings?
>
> Also, if you see any way to improve this series please share it as it would
> increase this contribution quality.
>
> Change since v2:
> * Use a char * for cap_string instead of an array, each line of this char *
> contains the capability number and its name.
> * Move the file under /sys/kernel/security instead of /sys/kernel.
>
> Francis Laniel (2):
>   capability: Add cap_string.
>   security/inode.c: Add capabilities file.
>
>  include/uapi/linux/capability.h |  1 +
>  kernel/capability.c             | 45 +++++++++++++++++++++++++++++++++
>  security/inode.c                | 16 ++++++++++++
>  3 files changed, 62 insertions(+)
>
>
> Best regards and thank you in advance for your reviews.
> ---
> [1] man capabilities
> [2] https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a5664fab21d6f7d1e/pkg/cap/cap_linux.go#L135
> [3] https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902aabc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c1
> moby relies on containerd code.
> [4] https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0b7f94e3a0ffb/capability/enum.go#L47
> [5] https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880ba0e6380519a/libcontainer/capabilities/capabilities.go#L12
> runc relies on syndtr package.
> [6] https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b880c089f1/src/libcrun/linux.c#L35

Eric

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH v3 0/2] Add capabilities file to sysfs
  2022-01-20 18:20 ` Eric W. Biederman
@ 2022-01-20 19:01   ` Francis Laniel
  0 siblings, 0 replies; 8+ messages in thread
From: Francis Laniel @ 2022-01-20 19:01 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, linux-security-module, Serge Hallyn, Casey Schaufler

Hi.

Le jeudi 20 janvier 2022, 19:20:15 CET Eric W. Biederman a écrit :
> Francis Laniel <flaniel@linux.microsoft.com> writes:
> > Hi.
> > 
> > 
> > First, I hope you are fine and the same for your relatives.
> > 
> > 
> > Capabilities are used to check if a thread has the right to perform a
> > given
> > action [1].
> > For example, a thread with CAP_BPF set can use the bpf() syscall.
> > 
> > Capabilities are used in the container world.
> > In terms of code, several projects related to container maintain code
> > where the capabilities are written alike include/uapi/linux/capability.h
> > [2][3][4][5]. For these projects, their codebase should be updated when a
> > new capability is added to the kernel.
> > Some other projects rely on <sys/capability.h> [6].
> > In this case, this header file should reflect the capabilities offered by
> > the kernel.
> > 
> > So, in this series, I added a new file to sysfs:
> > /sys/kernel/security/capabilities.
> 
> Actually that is a file in securityfs.  Which is related but slightly
> different.  For sysfs this would be immediately unacceptable as it
> breaks the one value per file rule.  I don't know what the rules
> are for securityfs but I do know files that contain many many lines
> and get very large tend to be problematic in both their kernel
> implementation and in userspace parsing speed.

I was not aware of the sysfs rule, thank you for sharing it to me, I will add 
it to my "kernel knowledge" and will make use of it in the future!

> So I am looking for what the advantage of this file that justifies the
> cost of maintaining it.
> 
> > The goal of this file is to be used by "container world" software to know
> > kernel capabilities at run time instead of compile time.
> 
> I don't understand the problem you are trying to solve.  If the software
> needs to updated what benefit is there for all of the information to be
> available at runtime?

Actually, software like containerd hardcodes all the capabilities the kernel 
knows based on a per-version approach [1].
So if a new capabilities, say CAP_NEW, is added to the kernel, containerd code 
would become something like this:
	// caps59 is the caps of kernel 5.9 (41 entries)
	caps59     = append(caps58, "CAP_CHECKPOINT_RESTORE")
	// caps59 is the caps of kernel 5.17 (42 entries)
	cap517    = append(caps59, "CAP_NEW")
	capsLatest = caps517
This approach has two drawbacks:
1. A user which wants to use CAP_NEW would need to use kernel 5.17 but also a 
version of containerd which knows about CAP_NEW. So, this user would need to 
wait for containerd code to be updated (or to modify it by him/herself and 
compiles it).
2. Containerd maintainers would need to change their code to add CAP_NEW.

It would be easier for everyone if the kernel exposes its capabilities, thus 
containerd code would become something like this:
	caps, err := readCapsFromFile("/sys/kernel/security/capabilities")
	if err {
		caps = capsLatest
	}
This approach addresses the first drawback I quoted above and partly the second 
one:
1. If user kernel was compiled with CONFIG_SECURITYFS, he/she does not need to 
wait the last version of containerd to use CAP_NEW, as containerd would read 
the kernel capabilities from "/sys/kernel/security/capabilities".
2. Containerd maintainers would not need to quickly update their code to 
reflect last kernel change.
I agree they would still need to update their code in case /sys/kernel/
security/capabilities does not exist.

To conclude, this series adds a bit of code in the kernel to make userland 
life easier while not doing nasty things inside the kernel.
What do you think about it?

> > The "file" is read-only and its content is the capability number
> > associated with the capability name:
> > root@vm-amd64:~# cat /sys/kernel/security/capabilities
> > 0       CAP_CHOWN
> > 1       CAP_DAC_OVERRIDE
> > ...
> > 40      CAP_CHECKPOINT_RESTORE
> > 
> > 
> > The kernel already exposes the last capability number under:
> > /proc/sys/kernel/cap_last_cap
> > So, I think there should not be any issue exposing all the capabilities it
> > offers.
> > If there is any, please share it as I do not want to introduce issue with
> > this series.
> 
> The mapping between capabilities and numbers should never change it is
> ABI.  I seem to remember a version number in the file capability so that
> if the mappings do change that number can be changed in a way that
> existing software is not confused.
> 
> What is the advantage in printing all of the mappings?

The printing of all the mappings can be used by containerd code to know about 
the capabilities the kernel knows.
For example, the above mentioned function readCapsFromFile() could be 
implemented like this:
func readCapsFromFile(path string) (string[], err) {
	var capabilities string[];
	// Open the file.
	re := regexp.MustCompile(`(\d+)\t(CAP_\w+)`)
    	for line /*...*/ {
		matches := re.FindStringSubmatch(line)
		// To simplify, I will make the hypothesis the regex matched and 
		// everything is OK.
		// matches[0] contains the whole line expect final '\n'.
		id := strconv.Atoi(matches[1])
		name := matches[2]
		capabilities[id] = name
   	 }
	// Close the file.
	return capabilities, nil
}
It will mainly parse the file output to create the capabilities array.

> 
> > Also, if you see any way to improve this series please share it as it
> > would
> > increase this contribution quality.
> > 
> > Change since v2:
> > * Use a char * for cap_string instead of an array, each line of this char
> > *
> > contains the capability number and its name.
> > * Move the file under /sys/kernel/security instead of /sys/kernel.
> > 
> > Francis Laniel (2):
> >   capability: Add cap_string.
> >   security/inode.c: Add capabilities file.
> >  
> >  include/uapi/linux/capability.h |  1 +
> >  kernel/capability.c             | 45 +++++++++++++++++++++++++++++++++
> >  security/inode.c                | 16 ++++++++++++
> >  3 files changed, 62 insertions(+)
> > 
> > Best regards and thank you in advance for your reviews.
> > ---
> > [1] man capabilities
> > [2]
> > https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a566
> > 4fab21d6f7d1e/pkg/cap/cap_linux.go#L135 [3]
> > https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902a
> > abc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c1
> > moby relies on containerd code.
> > [4]
> > https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0b
> > 7f94e3a0ffb/capability/enum.go#L47 [5]
> > https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880b
> > a0e6380519a/libcontainer/capabilities/capabilities.go#L12 runc relies on
> > syndtr package.
> > [6]
> > https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b88
> > 0c089f1/src/libcrun/linux.c#L35
> Eric

Best regards.
---
[1] https://github.com/containerd/containerd/blob/
1a078e6893d07fec10a4940a5664fab21d6f7d1e/pkg/cap/cap_linux.go#L181



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH v3 2/2] security/inode.c: Add capabilities file.
  2022-01-20 18:01 ` [RFC PATCH v3 2/2] security/inode.c: Add capabilities file Francis Laniel
@ 2022-01-21  8:58   ` Francis Laniel
  0 siblings, 0 replies; 8+ messages in thread
From: Francis Laniel @ 2022-01-21  8:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-security-module, Serge Hallyn, Casey Schaufler

Hi.

Le jeudi 20 janvier 2022, 19:01:16 CET Francis Laniel a écrit :
> This new read-only file prints the capabilities values with their names:
> cat /sys/kernel/security/capabilities
> 0       CAP_CHOWN
> 1       CAP_DAC_OVERRIDE
> ...
> 40      CAP_CHECKPOINT_RESTORE
> 
> Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
> ---
>  security/inode.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/security/inode.c b/security/inode.c
> index 6c326939750d..cef78b497bab 100644
> --- a/security/inode.c
> +++ b/security/inode.c
> @@ -21,6 +21,7 @@
>  #include <linux/security.h>
>  #include <linux/lsm_hooks.h>
>  #include <linux/magic.h>
> +#include <linux/capability.h>
> 
>  static struct vfsmount *mount;
>  static int mount_count;
> @@ -328,6 +329,19 @@ static const struct file_operations lsm_ops = {
>  };
>  #endif
> 
> +static struct dentry *capabilities_dentry;
> +static ssize_t capabilities_read(struct file *unused, char __user *buf,
> +				 size_t count, loff_t *ppos)
> +{
> +	return simple_read_from_buffer(buf, count, ppos, cap_string,
> +				       strlen(cap_string));
> +}
> +
> +static const struct file_operations capabilities_ops = {
> +	.read = capabilities_read,
> +	.llseek = generic_file_llseek,
> +};
> +
>  static int __init securityfs_init(void)
>  {
>  	int retval;
> @@ -345,6 +359,8 @@ static int __init securityfs_init(void)
>  	lsm_dentry = securityfs_create_file("lsm", 0444, NULL, NULL,
>  						&lsm_ops);
>  #endif
> +	capabilities_dentry = securityfs("capabilities", 0444, NULL, NULL,
> +					 capabilities_ops);

Sorry, I sent the old version of the patch and did not fixup this...
Kernel robot kindly show me this error.
I swear the output in the cover letter was done on the compiled kernel within 
a VM.

I will send a v4 correcting this but I will wait to get some reviews on v3 to 
not send to not generate too much traffic here.

>  	return 0;
>  }
>  core_initcall(securityfs_init);

Best regards.



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-01-21  8:58 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-20 18:01 [RFC PATCH v3 0/2] Add capabilities file to sysfs Francis Laniel
2022-01-20 18:01 ` [RFC PATCH v3 1/2] capability: Add cap_string Francis Laniel
2022-01-20 18:01 ` [RFC PATCH v3 2/2] security/inode.c: Add capabilities file Francis Laniel
2022-01-21  8:58   ` Francis Laniel
2022-01-20 18:09 ` [RFC PATCH v3 0/2] Add capabilities file to sysfs Casey Schaufler
2022-01-20 18:14   ` Francis Laniel
2022-01-20 18:20 ` Eric W. Biederman
2022-01-20 19:01   ` Francis Laniel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).