* [PATCH v4 02/21] fs: Remove check of s_user_ns for existing mounts in fs_fully_visible()
2016-04-26 19:36 ` Seth Forshee
@ 2016-04-26 19:36 ` Seth Forshee
-1 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Alexander Viro
Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee,
dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi,
linux-security-module-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-raid-u79uwXL29TY76Z2rM5mHXA,
fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn,
linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
selinux-+05T5uksL2qpZYMLLGbcSA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov
fs_fully_visible() ignores MNT_LOCK_NODEV when FS_USERS_DEV_MOUNT
is not set for the filesystem, but there is a bug in the logic
that may cause mounting to fail. It is doing this only when the
existing mount is not in init_user_ns but should check the new
mount instead. But the new mount is always in a non-init
namespace when fs_fully_visible() is called, so that condition
can simply be removed.
Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
---
fs/namespace.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/namespace.c b/fs/namespace.c
index f20c82f91ecb..c133318bec35 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3234,8 +3234,7 @@ static bool fs_fully_visible(struct file_system_type *type, int *new_mnt_flags)
mnt_flags = mnt->mnt.mnt_flags;
if (mnt->mnt.mnt_sb->s_iflags & SB_I_NOEXEC)
mnt_flags &= ~(MNT_LOCK_NOSUID | MNT_LOCK_NOEXEC);
- if (mnt->mnt.mnt_sb->s_user_ns != &init_user_ns &&
- !(mnt->mnt.mnt_sb->s_type->fs_flags & FS_USERNS_DEV_MOUNT))
+ if (!(mnt->mnt.mnt_sb->s_type->fs_flags & FS_USERNS_DEV_MOUNT))
mnt_flags &= ~(MNT_LOCK_NODEV);
/* Verify the mount flags are equal to or more permissive
--
2.7.4
------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 02/21] fs: Remove check of s_user_ns for existing mounts in fs_fully_visible()
@ 2016-04-26 19:36 ` Seth Forshee
0 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Alexander Viro
Cc: Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn,
Miklos Szeredi, Pavel Tikhomirov, linux-kernel, linux-bcache,
dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel,
linux-security-module, selinux, cgroups, Seth Forshee
fs_fully_visible() ignores MNT_LOCK_NODEV when FS_USERS_DEV_MOUNT
is not set for the filesystem, but there is a bug in the logic
that may cause mounting to fail. It is doing this only when the
existing mount is not in init_user_ns but should check the new
mount instead. But the new mount is always in a non-init
namespace when fs_fully_visible() is called, so that condition
can simply be removed.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
---
fs/namespace.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/namespace.c b/fs/namespace.c
index f20c82f91ecb..c133318bec35 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3234,8 +3234,7 @@ static bool fs_fully_visible(struct file_system_type *type, int *new_mnt_flags)
mnt_flags = mnt->mnt.mnt_flags;
if (mnt->mnt.mnt_sb->s_iflags & SB_I_NOEXEC)
mnt_flags &= ~(MNT_LOCK_NOSUID | MNT_LOCK_NOEXEC);
- if (mnt->mnt.mnt_sb->s_user_ns != &init_user_ns &&
- !(mnt->mnt.mnt_sb->s_type->fs_flags & FS_USERNS_DEV_MOUNT))
+ if (!(mnt->mnt.mnt_sb->s_type->fs_flags & FS_USERNS_DEV_MOUNT))
mnt_flags &= ~(MNT_LOCK_NODEV);
/* Verify the mount flags are equal to or more permissive
--
2.7.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 03/21] fs: Allow sysfs and cgroupfs to share super blocks between user namespaces
2016-04-26 19:36 ` Seth Forshee
@ 2016-04-26 19:36 ` Seth Forshee
-1 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Alexander Viro, Greg Kroah-Hartman,
Jeff Layton, J. Bruce Fields, Tejun Heo, Li Zefan,
Johannes Weiner
Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee,
dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi,
linux-security-module-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-raid-u79uwXL29TY76Z2rM5mHXA,
fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn,
linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
selinux-+05T5uksL2qpZYMLLGbcSA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov
Both of these filesystems already have use cases for mounting the
same super block from multiple user namespaces. For sysfs this
happens when using criu for snapshotting a container, where sysfs
is mounted in the containers network ns but the hosts user ns.
The cgroup filesystem shares the same super block for all mounts
of the same hierarchy regardless of the namespace.
As a result, the restriction on mounting a super block from a
single user namespace creates regressions for existing uses of
these filesystems. For these specific filesystems this
restriction isn't really necessary since the backing store is
objects in kernel memory and thus the ids assigned from inodes
is not subject to translation relative to s_user_ns.
Add a new filesystem flag, FS_USERNS_SHARE_SB, which when set
causes sget_userns() to skip the check of s_user_ns. Set this
flag for the sysfs and cgroup filesystems to fix the
regressions.
Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Acked-by: Serge Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>
---
fs/super.c | 3 ++-
fs/sysfs/mount.c | 3 ++-
include/linux/fs.h | 1 +
kernel/cgroup.c | 4 ++--
4 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/fs/super.c b/fs/super.c
index 092a7828442e..ead156b44bf8 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -472,7 +472,8 @@ retry:
hlist_for_each_entry(old, &type->fs_supers, s_instances) {
if (!test(old, data))
continue;
- if (user_ns != old->s_user_ns) {
+ if (!(type->fs_flags & FS_USERNS_SHARE_SB) &&
+ user_ns != old->s_user_ns) {
spin_unlock(&sb_lock);
if (s) {
up_write(&s->s_umount);
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index f3db82071cfb..9555accd4322 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -59,7 +59,8 @@ static struct file_system_type sysfs_fs_type = {
.name = "sysfs",
.mount = sysfs_mount,
.kill_sb = sysfs_kill_sb,
- .fs_flags = FS_USERNS_VISIBLE | FS_USERNS_MOUNT,
+ .fs_flags = FS_USERNS_VISIBLE | FS_USERNS_MOUNT |
+ FS_USERNS_SHARE_SB,
};
int __init sysfs_init(void)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index be0f8023e28c..66a639ec1bc4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1988,6 +1988,7 @@ struct file_system_type {
#define FS_USERNS_MOUNT 8 /* Can be mounted by userns root */
#define FS_USERNS_DEV_MOUNT 16 /* A userns mount does not imply MNT_NODEV */
#define FS_USERNS_VISIBLE 32 /* FS must already be visible */
+#define FS_USERNS_SHARE_SB 64 /* Allow sharing sb between userns-es */
#define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */
struct dentry *(*mount) (struct file_system_type *, int,
const char *, void *);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 671dc05c0b0f..9c9aa27e531a 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2247,14 +2247,14 @@ static struct file_system_type cgroup_fs_type = {
.name = "cgroup",
.mount = cgroup_mount,
.kill_sb = cgroup_kill_sb,
- .fs_flags = FS_USERNS_MOUNT,
+ .fs_flags = FS_USERNS_MOUNT | FS_USERNS_SHARE_SB,
};
static struct file_system_type cgroup2_fs_type = {
.name = "cgroup2",
.mount = cgroup_mount,
.kill_sb = cgroup_kill_sb,
- .fs_flags = FS_USERNS_MOUNT,
+ .fs_flags = FS_USERNS_MOUNT | FS_USERNS_SHARE_SB,
};
static char *cgroup_path_ns_locked(struct cgroup *cgrp, char *buf, size_t buflen,
--
2.7.4
------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 03/21] fs: Allow sysfs and cgroupfs to share super blocks between user namespaces
@ 2016-04-26 19:36 ` Seth Forshee
0 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Alexander Viro, Greg Kroah-Hartman,
Jeff Layton, J. Bruce Fields, Tejun Heo, Li Zefan,
Johannes Weiner
Cc: Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn,
Miklos Szeredi, Pavel Tikhomirov, linux-kernel, linux-bcache,
dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel,
linux-security-module, selinux, cgroups, Seth Forshee
Both of these filesystems already have use cases for mounting the
same super block from multiple user namespaces. For sysfs this
happens when using criu for snapshotting a container, where sysfs
is mounted in the containers network ns but the hosts user ns.
The cgroup filesystem shares the same super block for all mounts
of the same hierarchy regardless of the namespace.
As a result, the restriction on mounting a super block from a
single user namespace creates regressions for existing uses of
these filesystems. For these specific filesystems this
restriction isn't really necessary since the backing store is
objects in kernel memory and thus the ids assigned from inodes
is not subject to translation relative to s_user_ns.
Add a new filesystem flag, FS_USERNS_SHARE_SB, which when set
causes sget_userns() to skip the check of s_user_ns. Set this
flag for the sysfs and cgroup filesystems to fix the
regressions.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
---
fs/super.c | 3 ++-
fs/sysfs/mount.c | 3 ++-
include/linux/fs.h | 1 +
kernel/cgroup.c | 4 ++--
4 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/fs/super.c b/fs/super.c
index 092a7828442e..ead156b44bf8 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -472,7 +472,8 @@ retry:
hlist_for_each_entry(old, &type->fs_supers, s_instances) {
if (!test(old, data))
continue;
- if (user_ns != old->s_user_ns) {
+ if (!(type->fs_flags & FS_USERNS_SHARE_SB) &&
+ user_ns != old->s_user_ns) {
spin_unlock(&sb_lock);
if (s) {
up_write(&s->s_umount);
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index f3db82071cfb..9555accd4322 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -59,7 +59,8 @@ static struct file_system_type sysfs_fs_type = {
.name = "sysfs",
.mount = sysfs_mount,
.kill_sb = sysfs_kill_sb,
- .fs_flags = FS_USERNS_VISIBLE | FS_USERNS_MOUNT,
+ .fs_flags = FS_USERNS_VISIBLE | FS_USERNS_MOUNT |
+ FS_USERNS_SHARE_SB,
};
int __init sysfs_init(void)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index be0f8023e28c..66a639ec1bc4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1988,6 +1988,7 @@ struct file_system_type {
#define FS_USERNS_MOUNT 8 /* Can be mounted by userns root */
#define FS_USERNS_DEV_MOUNT 16 /* A userns mount does not imply MNT_NODEV */
#define FS_USERNS_VISIBLE 32 /* FS must already be visible */
+#define FS_USERNS_SHARE_SB 64 /* Allow sharing sb between userns-es */
#define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */
struct dentry *(*mount) (struct file_system_type *, int,
const char *, void *);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 671dc05c0b0f..9c9aa27e531a 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2247,14 +2247,14 @@ static struct file_system_type cgroup_fs_type = {
.name = "cgroup",
.mount = cgroup_mount,
.kill_sb = cgroup_kill_sb,
- .fs_flags = FS_USERNS_MOUNT,
+ .fs_flags = FS_USERNS_MOUNT | FS_USERNS_SHARE_SB,
};
static struct file_system_type cgroup2_fs_type = {
.name = "cgroup2",
.mount = cgroup_mount,
.kill_sb = cgroup_kill_sb,
- .fs_flags = FS_USERNS_MOUNT,
+ .fs_flags = FS_USERNS_MOUNT | FS_USERNS_SHARE_SB,
};
static char *cgroup_path_ns_locked(struct cgroup *cgrp, char *buf, size_t buflen,
--
2.7.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 07/21] selinux: Add support for unprivileged mounts from user namespaces
2016-04-26 19:36 ` Seth Forshee
@ 2016-04-26 19:36 ` Seth Forshee
-1 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Paul Moore, Stephen Smalley, Eric Paris
Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee,
James Morris, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi,
linux-security-module-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-raid-u79uwXL29TY76Z2rM5mHXA,
fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn,
linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Alexander Viro,
selinux-+05T5uksL2qpZYMLLGbcSA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov,
Serge E. Hallyn
Security labels from unprivileged mounts in user namespaces must
be ignored. Force superblocks from user namespaces whose labeling
behavior is to use xattrs to use mountpoint labeling instead.
For the mountpoint label, default to converting the current task
context into a form suitable for file objects, but also allow the
policy writer to specify a different label through policy
transition rules.
Pieced together from code snippets provided by Stephen Smalley.
Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Acked-by: Stephen Smalley <sds-+05T5uksL2qpZYMLLGbcSA@public.gmane.org>
Acked-by: James Morris <james.l.morris-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
security/selinux/hooks.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 1350167635cb..33beed3ac589 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -820,6 +820,28 @@ static int selinux_set_mnt_opts(struct super_block *sb,
goto out;
}
}
+
+ /*
+ * If this is a user namespace mount, no contexts are allowed
+ * on the command line and security labels must be ignored.
+ */
+ if (sb->s_user_ns != &init_user_ns) {
+ if (context_sid || fscontext_sid || rootcontext_sid ||
+ defcontext_sid) {
+ rc = -EACCES;
+ goto out;
+ }
+ if (sbsec->behavior == SECURITY_FS_USE_XATTR) {
+ sbsec->behavior = SECURITY_FS_USE_MNTPOINT;
+ rc = security_transition_sid(current_sid(), current_sid(),
+ SECCLASS_FILE, NULL,
+ &sbsec->mntpoint_sid);
+ if (rc)
+ goto out;
+ }
+ goto out_set_opts;
+ }
+
/* sets the context of the superblock for the fs being mounted. */
if (fscontext_sid) {
rc = may_context_mount_sb_relabel(fscontext_sid, sbsec, cred);
@@ -888,6 +910,7 @@ static int selinux_set_mnt_opts(struct super_block *sb,
sbsec->def_sid = defcontext_sid;
}
+out_set_opts:
rc = sb_finish_set_opts(sb);
out:
mutex_unlock(&sbsec->lock);
--
2.7.4
------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 07/21] selinux: Add support for unprivileged mounts from user namespaces
@ 2016-04-26 19:36 ` Seth Forshee
0 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Paul Moore, Stephen Smalley, Eric Paris
Cc: Alexander Viro, Serge Hallyn, Richard Weinberger,
Austin S Hemmelgarn, Miklos Szeredi, Pavel Tikhomirov,
linux-kernel, linux-bcache, dm-devel, linux-raid, linux-mtd,
linux-fsdevel, fuse-devel, linux-security-module, selinux,
cgroups, Seth Forshee, James Morris, Serge E. Hallyn
Security labels from unprivileged mounts in user namespaces must
be ignored. Force superblocks from user namespaces whose labeling
behavior is to use xattrs to use mountpoint labeling instead.
For the mountpoint label, default to converting the current task
context into a form suitable for file objects, but also allow the
policy writer to specify a different label through policy
transition rules.
Pieced together from code snippets provided by Stephen Smalley.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <james.l.morris@oracle.com>
---
security/selinux/hooks.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 1350167635cb..33beed3ac589 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -820,6 +820,28 @@ static int selinux_set_mnt_opts(struct super_block *sb,
goto out;
}
}
+
+ /*
+ * If this is a user namespace mount, no contexts are allowed
+ * on the command line and security labels must be ignored.
+ */
+ if (sb->s_user_ns != &init_user_ns) {
+ if (context_sid || fscontext_sid || rootcontext_sid ||
+ defcontext_sid) {
+ rc = -EACCES;
+ goto out;
+ }
+ if (sbsec->behavior == SECURITY_FS_USE_XATTR) {
+ sbsec->behavior = SECURITY_FS_USE_MNTPOINT;
+ rc = security_transition_sid(current_sid(), current_sid(),
+ SECCLASS_FILE, NULL,
+ &sbsec->mntpoint_sid);
+ if (rc)
+ goto out;
+ }
+ goto out_set_opts;
+ }
+
/* sets the context of the superblock for the fs being mounted. */
if (fscontext_sid) {
rc = may_context_mount_sb_relabel(fscontext_sid, sbsec, cred);
@@ -888,6 +910,7 @@ static int selinux_set_mnt_opts(struct super_block *sb,
sbsec->def_sid = defcontext_sid;
}
+out_set_opts:
rc = sb_finish_set_opts(sb);
out:
mutex_unlock(&sbsec->lock);
--
2.7.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 12/21] fs: Refuse uid/gid changes which don't map into s_user_ns
2016-04-26 19:36 ` Seth Forshee
@ 2016-04-26 19:36 ` Seth Forshee
-1 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Alexander Viro
Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee,
dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi,
linux-security-module-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-raid-u79uwXL29TY76Z2rM5mHXA,
fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn,
linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
selinux-+05T5uksL2qpZYMLLGbcSA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov
Add checks to inode_change_ok to verify that uid and gid changes
will map into the superblock's user namespace. If they do not
fail with -EOVERFLOW. This cannot be overriden with ATTR_FORCE.
Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
---
fs/attr.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/fs/attr.c b/fs/attr.c
index 25b24d0f6c88..3cfaaac4a18e 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -42,6 +42,17 @@ int inode_change_ok(const struct inode *inode, struct iattr *attr)
return error;
}
+ /*
+ * Verify that uid/gid changes are valid in the target namespace
+ * of the superblock. This cannot be overriden using ATTR_FORCE.
+ */
+ if (ia_valid & ATTR_UID &&
+ from_kuid(inode->i_sb->s_user_ns, attr->ia_uid) == (uid_t)-1)
+ return -EOVERFLOW;
+ if (ia_valid & ATTR_GID &&
+ from_kgid(inode->i_sb->s_user_ns, attr->ia_gid) == (gid_t)-1)
+ return -EOVERFLOW;
+
/* If force is set do it anyway. */
if (ia_valid & ATTR_FORCE)
return 0;
--
2.7.4
------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 12/21] fs: Refuse uid/gid changes which don't map into s_user_ns
@ 2016-04-26 19:36 ` Seth Forshee
0 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Alexander Viro
Cc: Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn,
Miklos Szeredi, Pavel Tikhomirov, linux-kernel, linux-bcache,
dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel,
linux-security-module, selinux, cgroups, Seth Forshee
Add checks to inode_change_ok to verify that uid and gid changes
will map into the superblock's user namespace. If they do not
fail with -EOVERFLOW. This cannot be overriden with ATTR_FORCE.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
---
fs/attr.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/fs/attr.c b/fs/attr.c
index 25b24d0f6c88..3cfaaac4a18e 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -42,6 +42,17 @@ int inode_change_ok(const struct inode *inode, struct iattr *attr)
return error;
}
+ /*
+ * Verify that uid/gid changes are valid in the target namespace
+ * of the superblock. This cannot be overriden using ATTR_FORCE.
+ */
+ if (ia_valid & ATTR_UID &&
+ from_kuid(inode->i_sb->s_user_ns, attr->ia_uid) == (uid_t)-1)
+ return -EOVERFLOW;
+ if (ia_valid & ATTR_GID &&
+ from_kgid(inode->i_sb->s_user_ns, attr->ia_gid) == (gid_t)-1)
+ return -EOVERFLOW;
+
/* If force is set do it anyway. */
if (ia_valid & ATTR_FORCE)
return 0;
--
2.7.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 13/21] fs: Update posix_acl support to handle user namespace mounts
2016-04-26 19:36 ` Seth Forshee
@ 2016-04-26 19:36 ` Seth Forshee
-1 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Alexander Viro
Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee,
dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi,
linux-security-module-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-raid-u79uwXL29TY76Z2rM5mHXA,
fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn,
linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
selinux-+05T5uksL2qpZYMLLGbcSA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov
ids in on-disk ACLs should be converted to s_user_ns instead of
init_user_ns as is done now. This introduces the possibility for
id mappings to fail, and when this happens syscalls will return
EOVERFLOW.
Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
---
fs/posix_acl.c | 67 ++++++++++++++++++++++++++---------------
fs/xattr.c | 19 +++++++++---
include/linux/posix_acl_xattr.h | 17 ++++++++---
3 files changed, 70 insertions(+), 33 deletions(-)
diff --git a/fs/posix_acl.c b/fs/posix_acl.c
index 711dd5170376..dac2842dd4cb 100644
--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -595,59 +595,77 @@ EXPORT_SYMBOL_GPL(posix_acl_create);
/*
* Fix up the uids and gids in posix acl extended attributes in place.
*/
-static void posix_acl_fix_xattr_userns(
+static int posix_acl_fix_xattr_userns(
struct user_namespace *to, struct user_namespace *from,
void *value, size_t size)
{
posix_acl_xattr_header *header = (posix_acl_xattr_header *)value;
posix_acl_xattr_entry *entry = (posix_acl_xattr_entry *)(header+1), *end;
int count;
- kuid_t uid;
- kgid_t gid;
+ kuid_t kuid;
+ uid_t uid;
+ kgid_t kgid;
+ gid_t gid;
if (!value)
- return;
+ return 0;
if (size < sizeof(posix_acl_xattr_header))
- return;
+ return 0;
if (header->a_version != cpu_to_le32(POSIX_ACL_XATTR_VERSION))
- return;
+ return 0;
count = posix_acl_xattr_count(size);
if (count < 0)
- return;
+ return 0;
if (count == 0)
- return;
+ return 0;
for (end = entry + count; entry != end; entry++) {
switch(le16_to_cpu(entry->e_tag)) {
case ACL_USER:
- uid = make_kuid(from, le32_to_cpu(entry->e_id));
- entry->e_id = cpu_to_le32(from_kuid(to, uid));
+ kuid = make_kuid(from, le32_to_cpu(entry->e_id));
+ if (!uid_valid(kuid))
+ return -EOVERFLOW;
+ uid = from_kuid(to, kuid);
+ if (uid == (uid_t)-1)
+ return -EOVERFLOW;
+ entry->e_id = cpu_to_le32(uid);
break;
case ACL_GROUP:
- gid = make_kgid(from, le32_to_cpu(entry->e_id));
- entry->e_id = cpu_to_le32(from_kgid(to, gid));
+ kgid = make_kgid(from, le32_to_cpu(entry->e_id));
+ if (!gid_valid(kgid))
+ return -EOVERFLOW;
+ gid = from_kgid(to, kgid);
+ if (gid == (gid_t)-1)
+ return -EOVERFLOW;
+ entry->e_id = cpu_to_le32(gid);
break;
default:
break;
}
}
+
+ return 0;
}
-void posix_acl_fix_xattr_from_user(void *value, size_t size)
+int
+posix_acl_fix_xattr_from_user(struct user_namespace *target_ns, void *value,
+ size_t size)
{
- struct user_namespace *user_ns = current_user_ns();
- if (user_ns == &init_user_ns)
- return;
- posix_acl_fix_xattr_userns(&init_user_ns, user_ns, value, size);
+ struct user_namespace *source_ns = current_user_ns();
+ if (source_ns == target_ns)
+ return 0;
+ return posix_acl_fix_xattr_userns(target_ns, source_ns, value, size);
}
-void posix_acl_fix_xattr_to_user(void *value, size_t size)
+int
+posix_acl_fix_xattr_to_user(struct user_namespace *source_ns, void *value,
+ size_t size)
{
- struct user_namespace *user_ns = current_user_ns();
- if (user_ns == &init_user_ns)
- return;
- posix_acl_fix_xattr_userns(user_ns, &init_user_ns, value, size);
+ struct user_namespace *target_ns = current_user_ns();
+ if (target_ns == source_ns)
+ return 0;
+ return posix_acl_fix_xattr_userns(target_ns, source_ns, value, size);
}
/*
@@ -780,7 +798,7 @@ posix_acl_xattr_get(const struct xattr_handler *handler,
if (acl == NULL)
return -ENODATA;
- error = posix_acl_to_xattr(&init_user_ns, acl, value, size);
+ error = posix_acl_to_xattr(dentry->d_sb->s_user_ns, acl, value, size);
posix_acl_release(acl);
return error;
@@ -806,7 +824,8 @@ posix_acl_xattr_set(const struct xattr_handler *handler,
return -EPERM;
if (value) {
- acl = posix_acl_from_xattr(&init_user_ns, value, size);
+ acl = posix_acl_from_xattr(dentry->d_sb->s_user_ns, value,
+ size);
if (IS_ERR(acl))
return PTR_ERR(acl);
diff --git a/fs/xattr.c b/fs/xattr.c
index 4861322e28e8..c541121945cd 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -330,8 +330,12 @@ setxattr(struct dentry *d, const char __user *name, const void __user *value,
goto out;
}
if ((strcmp(kname, XATTR_NAME_POSIX_ACL_ACCESS) == 0) ||
- (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0))
- posix_acl_fix_xattr_from_user(kvalue, size);
+ (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0)) {
+ error = posix_acl_fix_xattr_from_user(d->d_sb->s_user_ns,
+ kvalue, size);
+ if (error)
+ goto out;
+ }
}
error = vfs_setxattr(d, kname, kvalue, size, flags);
@@ -427,9 +431,14 @@ getxattr(struct dentry *d, const char __user *name, void __user *value,
error = vfs_getxattr(d, kname, kvalue, size);
if (error > 0) {
if ((strcmp(kname, XATTR_NAME_POSIX_ACL_ACCESS) == 0) ||
- (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0))
- posix_acl_fix_xattr_to_user(kvalue, size);
- if (size && copy_to_user(value, kvalue, error))
+ (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0)) {
+ int ret;
+ ret = posix_acl_fix_xattr_to_user(d->d_sb->s_user_ns,
+ kvalue, size);
+ if (ret)
+ error = ret;
+ }
+ if (error > 0 && size && copy_to_user(value, kvalue, error))
error = -EFAULT;
} else if (error == -ERANGE && size >= XATTR_SIZE_MAX) {
/* The file system tried to returned a value bigger
diff --git a/include/linux/posix_acl_xattr.h b/include/linux/posix_acl_xattr.h
index e5e8ec40278d..5dec6b10951a 100644
--- a/include/linux/posix_acl_xattr.h
+++ b/include/linux/posix_acl_xattr.h
@@ -49,14 +49,23 @@ posix_acl_xattr_count(size_t size)
}
#ifdef CONFIG_FS_POSIX_ACL
-void posix_acl_fix_xattr_from_user(void *value, size_t size);
-void posix_acl_fix_xattr_to_user(void *value, size_t size);
+int posix_acl_fix_xattr_from_user(struct user_namespace *target_ns,
+ void *value, size_t size);
+int posix_acl_fix_xattr_to_user(struct user_namespace *source_ns, void *value,
+ size_t size);
#else
-static inline void posix_acl_fix_xattr_from_user(void *value, size_t size)
+static inline int
+posix_acl_fix_xattr_from_user(struct user_namespace *target_ns, void *value,
+ size_t size)
{
+ return 0;
}
-static inline void posix_acl_fix_xattr_to_user(void *value, size_t size)
+
+static inline int
+posix_acl_fix_xattr_to_user(struct user_namespace *source_ns, void *value,
+ size_t size)
{
+ return 0;
}
#endif
--
2.7.4
------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 13/21] fs: Update posix_acl support to handle user namespace mounts
@ 2016-04-26 19:36 ` Seth Forshee
0 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Alexander Viro
Cc: Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn,
Miklos Szeredi, Pavel Tikhomirov, linux-kernel, linux-bcache,
dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel,
linux-security-module, selinux, cgroups, Seth Forshee
ids in on-disk ACLs should be converted to s_user_ns instead of
init_user_ns as is done now. This introduces the possibility for
id mappings to fail, and when this happens syscalls will return
EOVERFLOW.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
---
fs/posix_acl.c | 67 ++++++++++++++++++++++++++---------------
fs/xattr.c | 19 +++++++++---
include/linux/posix_acl_xattr.h | 17 ++++++++---
3 files changed, 70 insertions(+), 33 deletions(-)
diff --git a/fs/posix_acl.c b/fs/posix_acl.c
index 711dd5170376..dac2842dd4cb 100644
--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -595,59 +595,77 @@ EXPORT_SYMBOL_GPL(posix_acl_create);
/*
* Fix up the uids and gids in posix acl extended attributes in place.
*/
-static void posix_acl_fix_xattr_userns(
+static int posix_acl_fix_xattr_userns(
struct user_namespace *to, struct user_namespace *from,
void *value, size_t size)
{
posix_acl_xattr_header *header = (posix_acl_xattr_header *)value;
posix_acl_xattr_entry *entry = (posix_acl_xattr_entry *)(header+1), *end;
int count;
- kuid_t uid;
- kgid_t gid;
+ kuid_t kuid;
+ uid_t uid;
+ kgid_t kgid;
+ gid_t gid;
if (!value)
- return;
+ return 0;
if (size < sizeof(posix_acl_xattr_header))
- return;
+ return 0;
if (header->a_version != cpu_to_le32(POSIX_ACL_XATTR_VERSION))
- return;
+ return 0;
count = posix_acl_xattr_count(size);
if (count < 0)
- return;
+ return 0;
if (count == 0)
- return;
+ return 0;
for (end = entry + count; entry != end; entry++) {
switch(le16_to_cpu(entry->e_tag)) {
case ACL_USER:
- uid = make_kuid(from, le32_to_cpu(entry->e_id));
- entry->e_id = cpu_to_le32(from_kuid(to, uid));
+ kuid = make_kuid(from, le32_to_cpu(entry->e_id));
+ if (!uid_valid(kuid))
+ return -EOVERFLOW;
+ uid = from_kuid(to, kuid);
+ if (uid == (uid_t)-1)
+ return -EOVERFLOW;
+ entry->e_id = cpu_to_le32(uid);
break;
case ACL_GROUP:
- gid = make_kgid(from, le32_to_cpu(entry->e_id));
- entry->e_id = cpu_to_le32(from_kgid(to, gid));
+ kgid = make_kgid(from, le32_to_cpu(entry->e_id));
+ if (!gid_valid(kgid))
+ return -EOVERFLOW;
+ gid = from_kgid(to, kgid);
+ if (gid == (gid_t)-1)
+ return -EOVERFLOW;
+ entry->e_id = cpu_to_le32(gid);
break;
default:
break;
}
}
+
+ return 0;
}
-void posix_acl_fix_xattr_from_user(void *value, size_t size)
+int
+posix_acl_fix_xattr_from_user(struct user_namespace *target_ns, void *value,
+ size_t size)
{
- struct user_namespace *user_ns = current_user_ns();
- if (user_ns == &init_user_ns)
- return;
- posix_acl_fix_xattr_userns(&init_user_ns, user_ns, value, size);
+ struct user_namespace *source_ns = current_user_ns();
+ if (source_ns == target_ns)
+ return 0;
+ return posix_acl_fix_xattr_userns(target_ns, source_ns, value, size);
}
-void posix_acl_fix_xattr_to_user(void *value, size_t size)
+int
+posix_acl_fix_xattr_to_user(struct user_namespace *source_ns, void *value,
+ size_t size)
{
- struct user_namespace *user_ns = current_user_ns();
- if (user_ns == &init_user_ns)
- return;
- posix_acl_fix_xattr_userns(user_ns, &init_user_ns, value, size);
+ struct user_namespace *target_ns = current_user_ns();
+ if (target_ns == source_ns)
+ return 0;
+ return posix_acl_fix_xattr_userns(target_ns, source_ns, value, size);
}
/*
@@ -780,7 +798,7 @@ posix_acl_xattr_get(const struct xattr_handler *handler,
if (acl == NULL)
return -ENODATA;
- error = posix_acl_to_xattr(&init_user_ns, acl, value, size);
+ error = posix_acl_to_xattr(dentry->d_sb->s_user_ns, acl, value, size);
posix_acl_release(acl);
return error;
@@ -806,7 +824,8 @@ posix_acl_xattr_set(const struct xattr_handler *handler,
return -EPERM;
if (value) {
- acl = posix_acl_from_xattr(&init_user_ns, value, size);
+ acl = posix_acl_from_xattr(dentry->d_sb->s_user_ns, value,
+ size);
if (IS_ERR(acl))
return PTR_ERR(acl);
diff --git a/fs/xattr.c b/fs/xattr.c
index 4861322e28e8..c541121945cd 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -330,8 +330,12 @@ setxattr(struct dentry *d, const char __user *name, const void __user *value,
goto out;
}
if ((strcmp(kname, XATTR_NAME_POSIX_ACL_ACCESS) == 0) ||
- (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0))
- posix_acl_fix_xattr_from_user(kvalue, size);
+ (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0)) {
+ error = posix_acl_fix_xattr_from_user(d->d_sb->s_user_ns,
+ kvalue, size);
+ if (error)
+ goto out;
+ }
}
error = vfs_setxattr(d, kname, kvalue, size, flags);
@@ -427,9 +431,14 @@ getxattr(struct dentry *d, const char __user *name, void __user *value,
error = vfs_getxattr(d, kname, kvalue, size);
if (error > 0) {
if ((strcmp(kname, XATTR_NAME_POSIX_ACL_ACCESS) == 0) ||
- (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0))
- posix_acl_fix_xattr_to_user(kvalue, size);
- if (size && copy_to_user(value, kvalue, error))
+ (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0)) {
+ int ret;
+ ret = posix_acl_fix_xattr_to_user(d->d_sb->s_user_ns,
+ kvalue, size);
+ if (ret)
+ error = ret;
+ }
+ if (error > 0 && size && copy_to_user(value, kvalue, error))
error = -EFAULT;
} else if (error == -ERANGE && size >= XATTR_SIZE_MAX) {
/* The file system tried to returned a value bigger
diff --git a/include/linux/posix_acl_xattr.h b/include/linux/posix_acl_xattr.h
index e5e8ec40278d..5dec6b10951a 100644
--- a/include/linux/posix_acl_xattr.h
+++ b/include/linux/posix_acl_xattr.h
@@ -49,14 +49,23 @@ posix_acl_xattr_count(size_t size)
}
#ifdef CONFIG_FS_POSIX_ACL
-void posix_acl_fix_xattr_from_user(void *value, size_t size);
-void posix_acl_fix_xattr_to_user(void *value, size_t size);
+int posix_acl_fix_xattr_from_user(struct user_namespace *target_ns,
+ void *value, size_t size);
+int posix_acl_fix_xattr_to_user(struct user_namespace *source_ns, void *value,
+ size_t size);
#else
-static inline void posix_acl_fix_xattr_from_user(void *value, size_t size)
+static inline int
+posix_acl_fix_xattr_from_user(struct user_namespace *target_ns, void *value,
+ size_t size)
{
+ return 0;
}
-static inline void posix_acl_fix_xattr_to_user(void *value, size_t size)
+
+static inline int
+posix_acl_fix_xattr_to_user(struct user_namespace *source_ns, void *value,
+ size_t size)
{
+ return 0;
}
#endif
--
2.7.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 14/21] fs: Allow superblock owner to change ownership of inodes with unmappable ids
2016-04-26 19:36 ` Seth Forshee
@ 2016-04-26 19:36 ` Seth Forshee
-1 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Alexander Viro, Greg Kroah-Hartman
Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee,
dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi,
linux-security-module-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-raid-u79uwXL29TY76Z2rM5mHXA,
fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn,
linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
selinux-+05T5uksL2qpZYMLLGbcSA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov
In a userns mount some on-disk inodes may have ids which do not
map into s_user_ns, in which case the in-kernel inodes are owned
by invalid users. The superblock owner should be able to change
attributes of these inodes but cannot. However it is unsafe to
grant the superblock owner privileged access to all inodes in the
superblock since proc, sysfs, etc. use DAC to protect files which
may not belong to s_user_ns. The problem is restricted to only
inodes where the owner or group is an invalid user.
We can work around this by allowing users with CAP_CHOWN in
s_user_ns to change an invalid owner or group id, so long as the
other id is either invalid or mappable in s_user_ns. After
changing ownership the user will be privileged towards the inode
and thus able to change other attributes.
As an precaution, checks for invalid ids are added to the proc
and kernfs setattr interfaces. These filesystems are not expected
to have inodes with invalid ids, but if it does happen any
setattr operations will return -EPERM.
Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
---
fs/attr.c | 62 ++++++++++++++++++++++++++++++++++++++++++++-------
fs/kernfs/inode.c | 2 ++
fs/proc/base.c | 2 ++
fs/proc/generic.c | 3 +++
fs/proc/proc_sysctl.c | 2 ++
5 files changed, 63 insertions(+), 8 deletions(-)
diff --git a/fs/attr.c b/fs/attr.c
index 3cfaaac4a18e..06bb3f401559 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -16,6 +16,58 @@
#include <linux/evm.h>
#include <linux/ima.h>
+static bool chown_ok(const struct inode *inode, kuid_t uid)
+{
+ struct user_namespace *user_ns;
+
+ if (uid_eq(current_fsuid(), inode->i_uid) && uid_eq(uid, inode->i_uid))
+ return true;
+ if (capable_wrt_inode_uidgid(inode, CAP_CHOWN))
+ return true;
+
+ /*
+ * Inode uids/gids are of type kuid_t/kgid_t. As such, they can be
+ * a) INVALID_UID/INVALID_GID, b) valid and mappable into
+ * i_sb->s_user_ns, or c) valid but not mappable into
+ * i_sb->s_user_ns.
+ *
+ * For filesystems on user-supplied media ids will either be (a) or
+ * (b), so we permit CAP_CHOWN in s_user_ns to change INVALID_UID if
+ * the gid meets these conditions (and vice versa for INVALID_GID).
+ *
+ * For psuedo-filesystems like proc or sysfs ids will be either (b)
+ * or (c), so these conditions do not permit namespace-root to chown
+ * in those filesystems.
+ */
+ user_ns = inode->i_sb->s_user_ns;
+ if (!uid_valid(inode->i_uid) &&
+ (!gid_valid(inode->i_gid) || kgid_has_mapping(user_ns, inode->i_gid)) &&
+ ns_capable(user_ns, CAP_CHOWN))
+ return true;
+
+ return false;
+}
+
+static bool chgrp_ok(const struct inode *inode, kgid_t gid)
+{
+ struct user_namespace *user_ns;
+
+ if (uid_eq(current_fsuid(), inode->i_uid) &&
+ (in_group_p(gid) || gid_eq(gid, inode->i_gid)))
+ return true;
+ if (capable_wrt_inode_uidgid(inode, CAP_CHOWN))
+ return true;
+
+ /* Logic here is the same as in chown_ok(); see comment there. */
+ user_ns = inode->i_sb->s_user_ns;
+ if (!gid_valid(inode->i_gid) &&
+ (!uid_valid(inode->i_uid) || kuid_has_mapping(user_ns, inode->i_uid)) &&
+ ns_capable(user_ns, CAP_CHOWN))
+ return true;
+
+ return false;
+}
+
/**
* inode_change_ok - check if attribute changes to an inode are allowed
* @inode: inode to check
@@ -58,17 +110,11 @@ int inode_change_ok(const struct inode *inode, struct iattr *attr)
return 0;
/* Make sure a caller can chown. */
- if ((ia_valid & ATTR_UID) &&
- (!uid_eq(current_fsuid(), inode->i_uid) ||
- !uid_eq(attr->ia_uid, inode->i_uid)) &&
- !capable_wrt_inode_uidgid(inode, CAP_CHOWN))
+ if ((ia_valid & ATTR_UID) && !chown_ok(inode, attr->ia_uid))
return -EPERM;
/* Make sure caller can chgrp. */
- if ((ia_valid & ATTR_GID) &&
- (!uid_eq(current_fsuid(), inode->i_uid) ||
- (!in_group_p(attr->ia_gid) && !gid_eq(attr->ia_gid, inode->i_gid))) &&
- !capable_wrt_inode_uidgid(inode, CAP_CHOWN))
+ if ((ia_valid & ATTR_GID) && !chgrp_ok(inode, attr->ia_gid))
return -EPERM;
/* Make sure a caller can chmod. */
diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index 16405ae88d2d..2e97a337ee5f 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -117,6 +117,8 @@ int kernfs_iop_setattr(struct dentry *dentry, struct iattr *iattr)
if (!kn)
return -EINVAL;
+ if (!uid_valid(inode->i_uid) || !gid_valid(inode->i_gid))
+ return -EPERM;
mutex_lock(&kernfs_mutex);
error = inode_change_ok(inode, iattr);
diff --git a/fs/proc/base.c b/fs/proc/base.c
index b1755b23893e..648d623e2158 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -711,6 +711,8 @@ int proc_setattr(struct dentry *dentry, struct iattr *attr)
if (attr->ia_valid & ATTR_MODE)
return -EPERM;
+ if (!uid_valid(inode->i_uid) || !gid_valid(inode->i_gid))
+ return -EPERM;
error = inode_change_ok(inode, attr);
if (error)
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index ff3ffc76a937..1461570c552c 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -105,6 +105,9 @@ static int proc_notify_change(struct dentry *dentry, struct iattr *iattr)
struct proc_dir_entry *de = PDE(inode);
int error;
+ if (!uid_valid(inode->i_uid) || !gid_valid(inode->i_gid))
+ return -EPERM;
+
error = inode_change_ok(inode, iattr);
if (error)
return error;
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index fe5b6e6c4671..f5d575157194 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -752,6 +752,8 @@ static int proc_sys_setattr(struct dentry *dentry, struct iattr *attr)
if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID))
return -EPERM;
+ if (!uid_valid(inode->i_uid) || !gid_valid(inode->i_gid))
+ return -EPERM;
error = inode_change_ok(inode, attr);
if (error)
--
2.7.4
------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 14/21] fs: Allow superblock owner to change ownership of inodes with unmappable ids
@ 2016-04-26 19:36 ` Seth Forshee
0 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Alexander Viro, Greg Kroah-Hartman
Cc: Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn,
Miklos Szeredi, Pavel Tikhomirov, linux-kernel, linux-bcache,
dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel,
linux-security-module, selinux, cgroups, Seth Forshee
In a userns mount some on-disk inodes may have ids which do not
map into s_user_ns, in which case the in-kernel inodes are owned
by invalid users. The superblock owner should be able to change
attributes of these inodes but cannot. However it is unsafe to
grant the superblock owner privileged access to all inodes in the
superblock since proc, sysfs, etc. use DAC to protect files which
may not belong to s_user_ns. The problem is restricted to only
inodes where the owner or group is an invalid user.
We can work around this by allowing users with CAP_CHOWN in
s_user_ns to change an invalid owner or group id, so long as the
other id is either invalid or mappable in s_user_ns. After
changing ownership the user will be privileged towards the inode
and thus able to change other attributes.
As an precaution, checks for invalid ids are added to the proc
and kernfs setattr interfaces. These filesystems are not expected
to have inodes with invalid ids, but if it does happen any
setattr operations will return -EPERM.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
---
fs/attr.c | 62 ++++++++++++++++++++++++++++++++++++++++++++-------
fs/kernfs/inode.c | 2 ++
fs/proc/base.c | 2 ++
fs/proc/generic.c | 3 +++
fs/proc/proc_sysctl.c | 2 ++
5 files changed, 63 insertions(+), 8 deletions(-)
diff --git a/fs/attr.c b/fs/attr.c
index 3cfaaac4a18e..06bb3f401559 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -16,6 +16,58 @@
#include <linux/evm.h>
#include <linux/ima.h>
+static bool chown_ok(const struct inode *inode, kuid_t uid)
+{
+ struct user_namespace *user_ns;
+
+ if (uid_eq(current_fsuid(), inode->i_uid) && uid_eq(uid, inode->i_uid))
+ return true;
+ if (capable_wrt_inode_uidgid(inode, CAP_CHOWN))
+ return true;
+
+ /*
+ * Inode uids/gids are of type kuid_t/kgid_t. As such, they can be
+ * a) INVALID_UID/INVALID_GID, b) valid and mappable into
+ * i_sb->s_user_ns, or c) valid but not mappable into
+ * i_sb->s_user_ns.
+ *
+ * For filesystems on user-supplied media ids will either be (a) or
+ * (b), so we permit CAP_CHOWN in s_user_ns to change INVALID_UID if
+ * the gid meets these conditions (and vice versa for INVALID_GID).
+ *
+ * For psuedo-filesystems like proc or sysfs ids will be either (b)
+ * or (c), so these conditions do not permit namespace-root to chown
+ * in those filesystems.
+ */
+ user_ns = inode->i_sb->s_user_ns;
+ if (!uid_valid(inode->i_uid) &&
+ (!gid_valid(inode->i_gid) || kgid_has_mapping(user_ns, inode->i_gid)) &&
+ ns_capable(user_ns, CAP_CHOWN))
+ return true;
+
+ return false;
+}
+
+static bool chgrp_ok(const struct inode *inode, kgid_t gid)
+{
+ struct user_namespace *user_ns;
+
+ if (uid_eq(current_fsuid(), inode->i_uid) &&
+ (in_group_p(gid) || gid_eq(gid, inode->i_gid)))
+ return true;
+ if (capable_wrt_inode_uidgid(inode, CAP_CHOWN))
+ return true;
+
+ /* Logic here is the same as in chown_ok(); see comment there. */
+ user_ns = inode->i_sb->s_user_ns;
+ if (!gid_valid(inode->i_gid) &&
+ (!uid_valid(inode->i_uid) || kuid_has_mapping(user_ns, inode->i_uid)) &&
+ ns_capable(user_ns, CAP_CHOWN))
+ return true;
+
+ return false;
+}
+
/**
* inode_change_ok - check if attribute changes to an inode are allowed
* @inode: inode to check
@@ -58,17 +110,11 @@ int inode_change_ok(const struct inode *inode, struct iattr *attr)
return 0;
/* Make sure a caller can chown. */
- if ((ia_valid & ATTR_UID) &&
- (!uid_eq(current_fsuid(), inode->i_uid) ||
- !uid_eq(attr->ia_uid, inode->i_uid)) &&
- !capable_wrt_inode_uidgid(inode, CAP_CHOWN))
+ if ((ia_valid & ATTR_UID) && !chown_ok(inode, attr->ia_uid))
return -EPERM;
/* Make sure caller can chgrp. */
- if ((ia_valid & ATTR_GID) &&
- (!uid_eq(current_fsuid(), inode->i_uid) ||
- (!in_group_p(attr->ia_gid) && !gid_eq(attr->ia_gid, inode->i_gid))) &&
- !capable_wrt_inode_uidgid(inode, CAP_CHOWN))
+ if ((ia_valid & ATTR_GID) && !chgrp_ok(inode, attr->ia_gid))
return -EPERM;
/* Make sure a caller can chmod. */
diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index 16405ae88d2d..2e97a337ee5f 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -117,6 +117,8 @@ int kernfs_iop_setattr(struct dentry *dentry, struct iattr *iattr)
if (!kn)
return -EINVAL;
+ if (!uid_valid(inode->i_uid) || !gid_valid(inode->i_gid))
+ return -EPERM;
mutex_lock(&kernfs_mutex);
error = inode_change_ok(inode, iattr);
diff --git a/fs/proc/base.c b/fs/proc/base.c
index b1755b23893e..648d623e2158 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -711,6 +711,8 @@ int proc_setattr(struct dentry *dentry, struct iattr *attr)
if (attr->ia_valid & ATTR_MODE)
return -EPERM;
+ if (!uid_valid(inode->i_uid) || !gid_valid(inode->i_gid))
+ return -EPERM;
error = inode_change_ok(inode, attr);
if (error)
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index ff3ffc76a937..1461570c552c 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -105,6 +105,9 @@ static int proc_notify_change(struct dentry *dentry, struct iattr *iattr)
struct proc_dir_entry *de = PDE(inode);
int error;
+ if (!uid_valid(inode->i_uid) || !gid_valid(inode->i_gid))
+ return -EPERM;
+
error = inode_change_ok(inode, iattr);
if (error)
return error;
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index fe5b6e6c4671..f5d575157194 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -752,6 +752,8 @@ static int proc_sys_setattr(struct dentry *dentry, struct iattr *attr)
if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID))
return -EPERM;
+ if (!uid_valid(inode->i_uid) || !gid_valid(inode->i_gid))
+ return -EPERM;
error = inode_change_ok(inode, attr);
if (error)
--
2.7.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 16/21] fs: Allow superblock owner to access do_remount_sb()
2016-04-26 19:36 ` Seth Forshee
@ 2016-04-26 19:36 ` Seth Forshee
-1 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Alexander Viro
Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee,
dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi,
linux-security-module-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-raid-u79uwXL29TY76Z2rM5mHXA,
fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn,
linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
selinux-+05T5uksL2qpZYMLLGbcSA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov
Superblock level remounts are currently restricted to global
CAP_SYS_ADMIN, as is the path for changing the root mount to
read only on umount. Loosen both of these permission checks to
also allow CAP_SYS_ADMIN in any namespace which is privileged
towards the userns which originally mounted the filesystem.
Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Acked-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
---
fs/namespace.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/namespace.c b/fs/namespace.c
index 0ad8e4a4f50b..575e3f8b34fd 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1510,7 +1510,7 @@ static int do_umount(struct mount *mnt, int flags)
* Special case for "unmounting" root ...
* we just try to remount it readonly.
*/
- if (!capable(CAP_SYS_ADMIN))
+ if (!ns_capable(sb->s_user_ns, CAP_SYS_ADMIN))
return -EPERM;
down_write(&sb->s_umount);
if (!(sb->s_flags & MS_RDONLY))
@@ -2207,7 +2207,7 @@ static int do_remount(struct path *path, int flags, int mnt_flags,
down_write(&sb->s_umount);
if (flags & MS_BIND)
err = change_mount_flags(path->mnt, flags);
- else if (!capable(CAP_SYS_ADMIN))
+ else if (!ns_capable(sb->s_user_ns, CAP_SYS_ADMIN))
err = -EPERM;
else
err = do_remount_sb(sb, flags, data, 0);
--
2.7.4
------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 16/21] fs: Allow superblock owner to access do_remount_sb()
@ 2016-04-26 19:36 ` Seth Forshee
0 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Alexander Viro
Cc: Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn,
Miklos Szeredi, Pavel Tikhomirov, linux-kernel, linux-bcache,
dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel,
linux-security-module, selinux, cgroups, Seth Forshee
Superblock level remounts are currently restricted to global
CAP_SYS_ADMIN, as is the path for changing the root mount to
read only on umount. Loosen both of these permission checks to
also allow CAP_SYS_ADMIN in any namespace which is privileged
towards the userns which originally mounted the filesystem.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
---
fs/namespace.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/namespace.c b/fs/namespace.c
index 0ad8e4a4f50b..575e3f8b34fd 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1510,7 +1510,7 @@ static int do_umount(struct mount *mnt, int flags)
* Special case for "unmounting" root ...
* we just try to remount it readonly.
*/
- if (!capable(CAP_SYS_ADMIN))
+ if (!ns_capable(sb->s_user_ns, CAP_SYS_ADMIN))
return -EPERM;
down_write(&sb->s_umount);
if (!(sb->s_flags & MS_RDONLY))
@@ -2207,7 +2207,7 @@ static int do_remount(struct path *path, int flags, int mnt_flags,
down_write(&sb->s_umount);
if (flags & MS_BIND)
err = change_mount_flags(path->mnt, flags);
- else if (!capable(CAP_SYS_ADMIN))
+ else if (!ns_capable(sb->s_user_ns, CAP_SYS_ADMIN))
err = -EPERM;
else
err = do_remount_sb(sb, flags, data, 0);
--
2.7.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 17/21] capabilities: Allow privileged user in s_user_ns to set security.* xattrs
2016-04-26 19:36 ` Seth Forshee
@ 2016-04-26 19:36 ` Seth Forshee
-1 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Serge Hallyn, James Morris, Serge E. Hallyn
Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Miklos Szeredi,
Seth Forshee, dm-devel-H+wXaHxf7aLQT0dZR+AlfA,
linux-security-module-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-raid-u79uwXL29TY76Z2rM5mHXA,
fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn,
linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Alexander Viro,
selinux-+05T5uksL2qpZYMLLGbcSA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov
A privileged user in s_user_ns will generally have the ability to
manipulate the backing store and insert security.* xattrs into
the filesystem directly. Therefore the kernel must be prepared to
handle these xattrs from unprivileged mounts, and it makes little
sense for commoncap to prevent writing these xattrs to the
filesystem. The capability and LSM code have already been updated
to appropriately handle xattrs from unprivileged mounts, so it
is safe to loosen this restriction on setting xattrs.
The exception to this logic is that writing xattrs to a mounted
filesystem may also cause the LSM inode_post_setxattr or
inode_setsecurity callbacks to be invoked. SELinux will deny the
xattr update by virtue of applying mountpoint labeling to
unprivileged userns mounts, and Smack will deny the writes for
any user without global CAP_MAC_ADMIN, so loosening the
capability check in commoncap is safe in this respect as well.
Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
---
security/commoncap.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/security/commoncap.c b/security/commoncap.c
index e657227d221e..12477afaa8ed 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -664,15 +664,17 @@ int cap_bprm_secureexec(struct linux_binprm *bprm)
int cap_inode_setxattr(struct dentry *dentry, const char *name,
const void *value, size_t size, int flags)
{
+ struct user_namespace *user_ns = dentry->d_sb->s_user_ns;
+
if (!strcmp(name, XATTR_NAME_CAPS)) {
- if (!capable(CAP_SETFCAP))
+ if (!ns_capable(user_ns, CAP_SETFCAP))
return -EPERM;
return 0;
}
if (!strncmp(name, XATTR_SECURITY_PREFIX,
sizeof(XATTR_SECURITY_PREFIX) - 1) &&
- !capable(CAP_SYS_ADMIN))
+ !ns_capable(user_ns, CAP_SYS_ADMIN))
return -EPERM;
return 0;
}
@@ -690,15 +692,17 @@ int cap_inode_setxattr(struct dentry *dentry, const char *name,
*/
int cap_inode_removexattr(struct dentry *dentry, const char *name)
{
+ struct user_namespace *user_ns = dentry->d_sb->s_user_ns;
+
if (!strcmp(name, XATTR_NAME_CAPS)) {
- if (!capable(CAP_SETFCAP))
+ if (!ns_capable(user_ns, CAP_SETFCAP))
return -EPERM;
return 0;
}
if (!strncmp(name, XATTR_SECURITY_PREFIX,
sizeof(XATTR_SECURITY_PREFIX) - 1) &&
- !capable(CAP_SYS_ADMIN))
+ !ns_capable(user_ns, CAP_SYS_ADMIN))
return -EPERM;
return 0;
}
--
2.7.4
------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 17/21] capabilities: Allow privileged user in s_user_ns to set security.* xattrs
@ 2016-04-26 19:36 ` Seth Forshee
0 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Serge Hallyn, James Morris, Serge E. Hallyn
Cc: Alexander Viro, Richard Weinberger, Austin S Hemmelgarn,
Miklos Szeredi, Pavel Tikhomirov, linux-kernel, linux-bcache,
dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel,
linux-security-module, selinux, cgroups, Seth Forshee
A privileged user in s_user_ns will generally have the ability to
manipulate the backing store and insert security.* xattrs into
the filesystem directly. Therefore the kernel must be prepared to
handle these xattrs from unprivileged mounts, and it makes little
sense for commoncap to prevent writing these xattrs to the
filesystem. The capability and LSM code have already been updated
to appropriately handle xattrs from unprivileged mounts, so it
is safe to loosen this restriction on setting xattrs.
The exception to this logic is that writing xattrs to a mounted
filesystem may also cause the LSM inode_post_setxattr or
inode_setsecurity callbacks to be invoked. SELinux will deny the
xattr update by virtue of applying mountpoint labeling to
unprivileged userns mounts, and Smack will deny the writes for
any user without global CAP_MAC_ADMIN, so loosening the
capability check in commoncap is safe in this respect as well.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
---
security/commoncap.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/security/commoncap.c b/security/commoncap.c
index e657227d221e..12477afaa8ed 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -664,15 +664,17 @@ int cap_bprm_secureexec(struct linux_binprm *bprm)
int cap_inode_setxattr(struct dentry *dentry, const char *name,
const void *value, size_t size, int flags)
{
+ struct user_namespace *user_ns = dentry->d_sb->s_user_ns;
+
if (!strcmp(name, XATTR_NAME_CAPS)) {
- if (!capable(CAP_SETFCAP))
+ if (!ns_capable(user_ns, CAP_SETFCAP))
return -EPERM;
return 0;
}
if (!strncmp(name, XATTR_SECURITY_PREFIX,
sizeof(XATTR_SECURITY_PREFIX) - 1) &&
- !capable(CAP_SYS_ADMIN))
+ !ns_capable(user_ns, CAP_SYS_ADMIN))
return -EPERM;
return 0;
}
@@ -690,15 +692,17 @@ int cap_inode_setxattr(struct dentry *dentry, const char *name,
*/
int cap_inode_removexattr(struct dentry *dentry, const char *name)
{
+ struct user_namespace *user_ns = dentry->d_sb->s_user_ns;
+
if (!strcmp(name, XATTR_NAME_CAPS)) {
- if (!capable(CAP_SETFCAP))
+ if (!ns_capable(user_ns, CAP_SETFCAP))
return -EPERM;
return 0;
}
if (!strncmp(name, XATTR_SECURITY_PREFIX,
sizeof(XATTR_SECURITY_PREFIX) - 1) &&
- !capable(CAP_SYS_ADMIN))
+ !ns_capable(user_ns, CAP_SYS_ADMIN))
return -EPERM;
return 0;
}
--
2.7.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
[parent not found: <1461699396-33000-18-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>]
* Re: [PATCH v4 17/21] capabilities: Allow privileged user in s_user_ns to set security.* xattrs
2016-04-26 19:36 ` Seth Forshee
@ 2016-04-27 7:22 ` James Morris
-1 siblings, 0 replies; 50+ messages in thread
From: James Morris @ 2016-04-27 7:22 UTC (permalink / raw)
To: Seth Forshee
Cc: Eric W. Biederman, Serge Hallyn, James Morris, Serge E. Hallyn,
Alexander Viro, Richard Weinberger, Austin S Hemmelgarn,
Miklos Szeredi, Pavel Tikhomirov,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-bcache-u79uwXL29TY76Z2rM5mHXA,
dm-devel-H+wXaHxf7aLQT0dZR+AlfA,
linux-raid-u79uwXL29TY76Z2rM5mHXA,
linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
linux-security-module-u79uwXL29TY76Z2rM5mHXA,
selinux-+05T5uksL2qpZYMLLGbcSA, cgroups-u79uwXL29TY76Z2rM5mHXA
On Tue, 26 Apr 2016, Seth Forshee wrote:
> A privileged user in s_user_ns will generally have the ability to
> manipulate the backing store and insert security.* xattrs into
> the filesystem directly. Therefore the kernel must be prepared to
> handle these xattrs from unprivileged mounts, and it makes little
> sense for commoncap to prevent writing these xattrs to the
> filesystem. The capability and LSM code have already been updated
> to appropriately handle xattrs from unprivileged mounts, so it
> is safe to loosen this restriction on setting xattrs.
>
> The exception to this logic is that writing xattrs to a mounted
> filesystem may also cause the LSM inode_post_setxattr or
> inode_setsecurity callbacks to be invoked. SELinux will deny the
> xattr update by virtue of applying mountpoint labeling to
> unprivileged userns mounts, and Smack will deny the writes for
> any user without global CAP_MAC_ADMIN, so loosening the
> capability check in commoncap is safe in this respect as well.
>
> Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Acked-by: James Morris <james.l.morris-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
--
James Morris
<jmorris-gx6/JNMH7DfYtjvyW6yDsg@public.gmane.org>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v4 17/21] capabilities: Allow privileged user in s_user_ns to set security.* xattrs
@ 2016-04-27 7:22 ` James Morris
0 siblings, 0 replies; 50+ messages in thread
From: James Morris @ 2016-04-27 7:22 UTC (permalink / raw)
To: Seth Forshee
Cc: Eric W. Biederman, Serge Hallyn, James Morris, Serge E. Hallyn,
Alexander Viro, Richard Weinberger, Austin S Hemmelgarn,
Miklos Szeredi, Pavel Tikhomirov, linux-kernel, linux-bcache,
dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel,
linux-security-module, selinux, cgroups
On Tue, 26 Apr 2016, Seth Forshee wrote:
> A privileged user in s_user_ns will generally have the ability to
> manipulate the backing store and insert security.* xattrs into
> the filesystem directly. Therefore the kernel must be prepared to
> handle these xattrs from unprivileged mounts, and it makes little
> sense for commoncap to prevent writing these xattrs to the
> filesystem. The capability and LSM code have already been updated
> to appropriately handle xattrs from unprivileged mounts, so it
> is safe to loosen this restriction on setting xattrs.
>
> The exception to this logic is that writing xattrs to a mounted
> filesystem may also cause the LSM inode_post_setxattr or
> inode_setsecurity callbacks to be invoked. SELinux will deny the
> xattr update by virtue of applying mountpoint labeling to
> unprivileged userns mounts, and Smack will deny the writes for
> any user without global CAP_MAC_ADMIN, so loosening the
> capability check in commoncap is safe in this respect as well.
>
> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
> Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: James Morris <james.l.morris@oracle.com>
--
James Morris
<jmorris@namei.org>
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH v4 18/21] fuse: Add support for pid namespaces
2016-04-26 19:36 ` Seth Forshee
@ 2016-04-26 19:36 ` Seth Forshee
-1 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Miklos Szeredi
Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee,
dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi,
linux-security-module-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-raid-u79uwXL29TY76Z2rM5mHXA,
fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn,
linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Alexander Viro,
selinux-+05T5uksL2qpZYMLLGbcSA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov
When the userspace process servicing fuse requests is running in
a pid namespace then pids passed via the fuse fd are not being
translated into that process' namespace. Translation is necessary
for the pid to be useful to that process.
Since no use case currently exists for changing namespaces all
translations can be done relative to the pid namespace in use
when fuse_conn_init() is called. For fuse this translates to
mount time, and for cuse this is when /dev/cuse is opened. IO for
this connection from another namespace will return errors.
Requests from processes whose pid cannot be translated into the
target namespace are not permitted, except for requests
allocated via fuse_get_req_nofail_nopages. For no-fail requests
in.h.pid will be 0 if the pid translation fails.
File locking changes based on previous work done by Eric
Biederman.
Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Acked-by: Miklos Szeredi <mszeredi-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
fs/fuse/dev.c | 19 +++++++++++++++----
fs/fuse/file.c | 22 +++++++++++++++++-----
fs/fuse/fuse_i.h | 4 ++++
fs/fuse/inode.c | 3 +++
4 files changed, 39 insertions(+), 9 deletions(-)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index cbece1221417..4e91b2ac25a7 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -19,6 +19,7 @@
#include <linux/pipe_fs_i.h>
#include <linux/swap.h>
#include <linux/splice.h>
+#include <linux/sched.h>
MODULE_ALIAS_MISCDEV(FUSE_MINOR);
MODULE_ALIAS("devname:fuse");
@@ -124,11 +125,11 @@ static void __fuse_put_request(struct fuse_req *req)
atomic_dec(&req->count);
}
-static void fuse_req_init_context(struct fuse_req *req)
+static void fuse_req_init_context(struct fuse_conn *fc, struct fuse_req *req)
{
req->in.h.uid = from_kuid_munged(&init_user_ns, current_fsuid());
req->in.h.gid = from_kgid_munged(&init_user_ns, current_fsgid());
- req->in.h.pid = current->pid;
+ req->in.h.pid = pid_nr_ns(task_pid(current), fc->pid_ns);
}
void fuse_set_initialized(struct fuse_conn *fc)
@@ -181,10 +182,14 @@ static struct fuse_req *__fuse_get_req(struct fuse_conn *fc, unsigned npages,
goto out;
}
- fuse_req_init_context(req);
+ fuse_req_init_context(fc, req);
__set_bit(FR_WAITING, &req->flags);
if (for_background)
__set_bit(FR_BACKGROUND, &req->flags);
+ if (req->in.h.pid == 0) {
+ fuse_put_request(fc, req);
+ return ERR_PTR(-EOVERFLOW);
+ }
return req;
@@ -274,7 +279,7 @@ struct fuse_req *fuse_get_req_nofail_nopages(struct fuse_conn *fc,
if (!req)
req = get_reserved_req(fc, file);
- fuse_req_init_context(req);
+ fuse_req_init_context(fc, req);
__set_bit(FR_WAITING, &req->flags);
__clear_bit(FR_BACKGROUND, &req->flags);
return req;
@@ -1243,6 +1248,9 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
struct fuse_in *in;
unsigned reqsize;
+ if (task_active_pid_ns(current) != fc->pid_ns)
+ return -EIO;
+
restart:
spin_lock(&fiq->waitq.lock);
err = -EAGAIN;
@@ -1872,6 +1880,9 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
struct fuse_req *req;
struct fuse_out_header oh;
+ if (task_active_pid_ns(current) != fc->pid_ns)
+ return -EIO;
+
if (nbytes < sizeof(struct fuse_out_header))
return -EINVAL;
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 719924d6c706..b5c616c5ec98 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -2067,7 +2067,8 @@ static int fuse_direct_mmap(struct file *file, struct vm_area_struct *vma)
return generic_file_mmap(file, vma);
}
-static int convert_fuse_file_lock(const struct fuse_file_lock *ffl,
+static int convert_fuse_file_lock(struct fuse_conn *fc,
+ const struct fuse_file_lock *ffl,
struct file_lock *fl)
{
switch (ffl->type) {
@@ -2082,7 +2083,14 @@ static int convert_fuse_file_lock(const struct fuse_file_lock *ffl,
fl->fl_start = ffl->start;
fl->fl_end = ffl->end;
- fl->fl_pid = ffl->pid;
+
+ /*
+ * Convert pid into the caller's pid namespace. If the pid
+ * does not map into the namespace fl_pid will get set to 0.
+ */
+ rcu_read_lock();
+ fl->fl_pid = pid_vnr(find_pid_ns(ffl->pid, fc->pid_ns));
+ rcu_read_unlock();
break;
default:
@@ -2131,7 +2139,7 @@ static int fuse_getlk(struct file *file, struct file_lock *fl)
args.out.args[0].value = &outarg;
err = fuse_simple_request(fc, &args);
if (!err)
- err = convert_fuse_file_lock(&outarg.lk, fl);
+ err = convert_fuse_file_lock(fc, &outarg.lk, fl);
return err;
}
@@ -2143,7 +2151,8 @@ static int fuse_setlk(struct file *file, struct file_lock *fl, int flock)
FUSE_ARGS(args);
struct fuse_lk_in inarg;
int opcode = (fl->fl_flags & FL_SLEEP) ? FUSE_SETLKW : FUSE_SETLK;
- pid_t pid = fl->fl_type != F_UNLCK ? current->tgid : 0;
+ struct pid *pid = fl->fl_type != F_UNLCK ? task_tgid(current) : NULL;
+ pid_t pid_nr = pid_nr_ns(pid, fc->pid_ns);
int err;
if (fl->fl_lmops && fl->fl_lmops->lm_grant) {
@@ -2155,7 +2164,10 @@ static int fuse_setlk(struct file *file, struct file_lock *fl, int flock)
if (fl->fl_flags & FL_CLOSE)
return 0;
- fuse_lk_fill(&args, file, fl, opcode, pid, flock, &inarg);
+ if (pid && pid_nr == 0)
+ return -EOVERFLOW;
+
+ fuse_lk_fill(&args, file, fl, opcode, pid_nr, flock, &inarg);
err = fuse_simple_request(fc, &args);
/* locking is restartable */
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index eddbe02c4028..9145445a759a 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -23,6 +23,7 @@
#include <linux/poll.h>
#include <linux/workqueue.h>
#include <linux/kref.h>
+#include <linux/pid_namespace.h>
/** Max number of pages that can be used in a single read request */
#define FUSE_MAX_PAGES_PER_REQ 32
@@ -465,6 +466,9 @@ struct fuse_conn {
/** The group id for this mount */
kgid_t group_id;
+ /** The pid namespace for this mount */
+ struct pid_namespace *pid_ns;
+
/** The fuse mount flags for this mount */
unsigned flags;
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 1ce67668a8e1..eade0bfa4488 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -20,6 +20,7 @@
#include <linux/random.h>
#include <linux/sched.h>
#include <linux/exportfs.h>
+#include <linux/pid_namespace.h>
MODULE_AUTHOR("Miklos Szeredi <miklos-sUDqSbJrdHQHWmgEVkV9KA@public.gmane.org>");
MODULE_DESCRIPTION("Filesystem in Userspace");
@@ -609,6 +610,7 @@ void fuse_conn_init(struct fuse_conn *fc)
fc->connected = 1;
fc->attr_version = 1;
get_random_bytes(&fc->scramble_key, sizeof(fc->scramble_key));
+ fc->pid_ns = get_pid_ns(task_active_pid_ns(current));
}
EXPORT_SYMBOL_GPL(fuse_conn_init);
@@ -617,6 +619,7 @@ void fuse_conn_put(struct fuse_conn *fc)
if (atomic_dec_and_test(&fc->count)) {
if (fc->destroy_req)
fuse_request_free(fc->destroy_req);
+ put_pid_ns(fc->pid_ns);
fc->release(fc);
}
}
--
2.7.4
------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 18/21] fuse: Add support for pid namespaces
@ 2016-04-26 19:36 ` Seth Forshee
0 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Miklos Szeredi
Cc: Alexander Viro, Serge Hallyn, Richard Weinberger,
Austin S Hemmelgarn, Miklos Szeredi, Pavel Tikhomirov,
linux-kernel, linux-bcache, dm-devel, linux-raid, linux-mtd,
linux-fsdevel, fuse-devel, linux-security-module, selinux,
cgroups, Seth Forshee
When the userspace process servicing fuse requests is running in
a pid namespace then pids passed via the fuse fd are not being
translated into that process' namespace. Translation is necessary
for the pid to be useful to that process.
Since no use case currently exists for changing namespaces all
translations can be done relative to the pid namespace in use
when fuse_conn_init() is called. For fuse this translates to
mount time, and for cuse this is when /dev/cuse is opened. IO for
this connection from another namespace will return errors.
Requests from processes whose pid cannot be translated into the
target namespace are not permitted, except for requests
allocated via fuse_get_req_nofail_nopages. For no-fail requests
in.h.pid will be 0 if the pid translation fails.
File locking changes based on previous work done by Eric
Biederman.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Miklos Szeredi <mszeredi@redhat.com>
---
fs/fuse/dev.c | 19 +++++++++++++++----
fs/fuse/file.c | 22 +++++++++++++++++-----
fs/fuse/fuse_i.h | 4 ++++
fs/fuse/inode.c | 3 +++
4 files changed, 39 insertions(+), 9 deletions(-)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index cbece1221417..4e91b2ac25a7 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -19,6 +19,7 @@
#include <linux/pipe_fs_i.h>
#include <linux/swap.h>
#include <linux/splice.h>
+#include <linux/sched.h>
MODULE_ALIAS_MISCDEV(FUSE_MINOR);
MODULE_ALIAS("devname:fuse");
@@ -124,11 +125,11 @@ static void __fuse_put_request(struct fuse_req *req)
atomic_dec(&req->count);
}
-static void fuse_req_init_context(struct fuse_req *req)
+static void fuse_req_init_context(struct fuse_conn *fc, struct fuse_req *req)
{
req->in.h.uid = from_kuid_munged(&init_user_ns, current_fsuid());
req->in.h.gid = from_kgid_munged(&init_user_ns, current_fsgid());
- req->in.h.pid = current->pid;
+ req->in.h.pid = pid_nr_ns(task_pid(current), fc->pid_ns);
}
void fuse_set_initialized(struct fuse_conn *fc)
@@ -181,10 +182,14 @@ static struct fuse_req *__fuse_get_req(struct fuse_conn *fc, unsigned npages,
goto out;
}
- fuse_req_init_context(req);
+ fuse_req_init_context(fc, req);
__set_bit(FR_WAITING, &req->flags);
if (for_background)
__set_bit(FR_BACKGROUND, &req->flags);
+ if (req->in.h.pid == 0) {
+ fuse_put_request(fc, req);
+ return ERR_PTR(-EOVERFLOW);
+ }
return req;
@@ -274,7 +279,7 @@ struct fuse_req *fuse_get_req_nofail_nopages(struct fuse_conn *fc,
if (!req)
req = get_reserved_req(fc, file);
- fuse_req_init_context(req);
+ fuse_req_init_context(fc, req);
__set_bit(FR_WAITING, &req->flags);
__clear_bit(FR_BACKGROUND, &req->flags);
return req;
@@ -1243,6 +1248,9 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
struct fuse_in *in;
unsigned reqsize;
+ if (task_active_pid_ns(current) != fc->pid_ns)
+ return -EIO;
+
restart:
spin_lock(&fiq->waitq.lock);
err = -EAGAIN;
@@ -1872,6 +1880,9 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
struct fuse_req *req;
struct fuse_out_header oh;
+ if (task_active_pid_ns(current) != fc->pid_ns)
+ return -EIO;
+
if (nbytes < sizeof(struct fuse_out_header))
return -EINVAL;
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 719924d6c706..b5c616c5ec98 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -2067,7 +2067,8 @@ static int fuse_direct_mmap(struct file *file, struct vm_area_struct *vma)
return generic_file_mmap(file, vma);
}
-static int convert_fuse_file_lock(const struct fuse_file_lock *ffl,
+static int convert_fuse_file_lock(struct fuse_conn *fc,
+ const struct fuse_file_lock *ffl,
struct file_lock *fl)
{
switch (ffl->type) {
@@ -2082,7 +2083,14 @@ static int convert_fuse_file_lock(const struct fuse_file_lock *ffl,
fl->fl_start = ffl->start;
fl->fl_end = ffl->end;
- fl->fl_pid = ffl->pid;
+
+ /*
+ * Convert pid into the caller's pid namespace. If the pid
+ * does not map into the namespace fl_pid will get set to 0.
+ */
+ rcu_read_lock();
+ fl->fl_pid = pid_vnr(find_pid_ns(ffl->pid, fc->pid_ns));
+ rcu_read_unlock();
break;
default:
@@ -2131,7 +2139,7 @@ static int fuse_getlk(struct file *file, struct file_lock *fl)
args.out.args[0].value = &outarg;
err = fuse_simple_request(fc, &args);
if (!err)
- err = convert_fuse_file_lock(&outarg.lk, fl);
+ err = convert_fuse_file_lock(fc, &outarg.lk, fl);
return err;
}
@@ -2143,7 +2151,8 @@ static int fuse_setlk(struct file *file, struct file_lock *fl, int flock)
FUSE_ARGS(args);
struct fuse_lk_in inarg;
int opcode = (fl->fl_flags & FL_SLEEP) ? FUSE_SETLKW : FUSE_SETLK;
- pid_t pid = fl->fl_type != F_UNLCK ? current->tgid : 0;
+ struct pid *pid = fl->fl_type != F_UNLCK ? task_tgid(current) : NULL;
+ pid_t pid_nr = pid_nr_ns(pid, fc->pid_ns);
int err;
if (fl->fl_lmops && fl->fl_lmops->lm_grant) {
@@ -2155,7 +2164,10 @@ static int fuse_setlk(struct file *file, struct file_lock *fl, int flock)
if (fl->fl_flags & FL_CLOSE)
return 0;
- fuse_lk_fill(&args, file, fl, opcode, pid, flock, &inarg);
+ if (pid && pid_nr == 0)
+ return -EOVERFLOW;
+
+ fuse_lk_fill(&args, file, fl, opcode, pid_nr, flock, &inarg);
err = fuse_simple_request(fc, &args);
/* locking is restartable */
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index eddbe02c4028..9145445a759a 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -23,6 +23,7 @@
#include <linux/poll.h>
#include <linux/workqueue.h>
#include <linux/kref.h>
+#include <linux/pid_namespace.h>
/** Max number of pages that can be used in a single read request */
#define FUSE_MAX_PAGES_PER_REQ 32
@@ -465,6 +466,9 @@ struct fuse_conn {
/** The group id for this mount */
kgid_t group_id;
+ /** The pid namespace for this mount */
+ struct pid_namespace *pid_ns;
+
/** The fuse mount flags for this mount */
unsigned flags;
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 1ce67668a8e1..eade0bfa4488 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -20,6 +20,7 @@
#include <linux/random.h>
#include <linux/sched.h>
#include <linux/exportfs.h>
+#include <linux/pid_namespace.h>
MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
MODULE_DESCRIPTION("Filesystem in Userspace");
@@ -609,6 +610,7 @@ void fuse_conn_init(struct fuse_conn *fc)
fc->connected = 1;
fc->attr_version = 1;
get_random_bytes(&fc->scramble_key, sizeof(fc->scramble_key));
+ fc->pid_ns = get_pid_ns(task_active_pid_ns(current));
}
EXPORT_SYMBOL_GPL(fuse_conn_init);
@@ -617,6 +619,7 @@ void fuse_conn_put(struct fuse_conn *fc)
if (atomic_dec_and_test(&fc->count)) {
if (fc->destroy_req)
fuse_request_free(fc->destroy_req);
+ put_pid_ns(fc->pid_ns);
fc->release(fc);
}
}
--
2.7.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCH v4 18/21] fuse: Add support for pid namespaces
2016-04-26 19:36 ` Seth Forshee
(?)
@ 2016-07-20 2:44 ` Sheng Yang
[not found] ` <CA+2rt426_pshAauQizcxkfAq16vmEpB4sJ4genW_ucosH3j=zQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
-1 siblings, 1 reply; 50+ messages in thread
From: Sheng Yang @ 2016-07-20 2:44 UTC (permalink / raw)
To: Seth Forshee
Cc: Eric W. Biederman, Miklos Szeredi, Alexander Viro, Serge Hallyn,
Richard Weinberger, Austin S Hemmelgarn, Miklos Szeredi,
Pavel Tikhomirov, kernel list, linux-bcache, dm-devel,
linux-raid, linux-mtd, linux-fsdevel, fuse-devel,
linux-security-module, selinux, cgroups
On Tue, Apr 26, 2016 at 12:36 PM, Seth Forshee
<seth.forshee@canonical.com> wrote:
> When the userspace process servicing fuse requests is running in
> a pid namespace then pids passed via the fuse fd are not being
> translated into that process' namespace. Translation is necessary
> for the pid to be useful to that process.
>
> Since no use case currently exists for changing namespaces all
> translations can be done relative to the pid namespace in use
> when fuse_conn_init() is called. For fuse this translates to
> mount time, and for cuse this is when /dev/cuse is opened. IO for
> this connection from another namespace will return errors.
>
> Requests from processes whose pid cannot be translated into the
> target namespace are not permitted, except for requests
> allocated via fuse_get_req_nofail_nopages. For no-fail requests
> in.h.pid will be 0 if the pid translation fails.
Hi Seth,
This patch caused a regression in our major container use case with
FUSE in Ubuntu 16.04, as patch was checked in as Ubuntu Sauce in
Ubuntu 4.4.0-6.21 kernel.
The use case is:
1. Create a Docker container.
2. Inside the container, start the FUSE backend, and mounted fs.
3. Following step 2 in the container, create a loopback device to map
a file in the mounted fuse to create a block device, which will be
available to the whole system.
It works well before this commit.
The use case is broken because no matter which namespace losetup runs,
the real request from loopback device seems always come from init ns,
thus it will be in different ns running fuse backend. So the request
will got denied, because the ns running fuse won't able to see the
things from higher level(level 0 in fact) pid namespace.
I think since init pid ns has ability to access any process in the
system, it should able to access the fuse mounted by any pid namespace
process as well.
What you think?
--Sheng
>
> File locking changes based on previous work done by Eric
> Biederman.
>
> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
> Acked-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
> fs/fuse/dev.c | 19 +++++++++++++++----
> fs/fuse/file.c | 22 +++++++++++++++++-----
> fs/fuse/fuse_i.h | 4 ++++
> fs/fuse/inode.c | 3 +++
> 4 files changed, 39 insertions(+), 9 deletions(-)
>
> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> index cbece1221417..4e91b2ac25a7 100644
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -19,6 +19,7 @@
> #include <linux/pipe_fs_i.h>
> #include <linux/swap.h>
> #include <linux/splice.h>
> +#include <linux/sched.h>
>
> MODULE_ALIAS_MISCDEV(FUSE_MINOR);
> MODULE_ALIAS("devname:fuse");
> @@ -124,11 +125,11 @@ static void __fuse_put_request(struct fuse_req *req)
> atomic_dec(&req->count);
> }
>
> -static void fuse_req_init_context(struct fuse_req *req)
> +static void fuse_req_init_context(struct fuse_conn *fc, struct fuse_req *req)
> {
> req->in.h.uid = from_kuid_munged(&init_user_ns, current_fsuid());
> req->in.h.gid = from_kgid_munged(&init_user_ns, current_fsgid());
> - req->in.h.pid = current->pid;
> + req->in.h.pid = pid_nr_ns(task_pid(current), fc->pid_ns);
> }
>
> void fuse_set_initialized(struct fuse_conn *fc)
> @@ -181,10 +182,14 @@ static struct fuse_req *__fuse_get_req(struct fuse_conn *fc, unsigned npages,
> goto out;
> }
>
> - fuse_req_init_context(req);
> + fuse_req_init_context(fc, req);
> __set_bit(FR_WAITING, &req->flags);
> if (for_background)
> __set_bit(FR_BACKGROUND, &req->flags);
> + if (req->in.h.pid == 0) {
> + fuse_put_request(fc, req);
> + return ERR_PTR(-EOVERFLOW);
> + }
>
> return req;
>
> @@ -274,7 +279,7 @@ struct fuse_req *fuse_get_req_nofail_nopages(struct fuse_conn *fc,
> if (!req)
> req = get_reserved_req(fc, file);
>
> - fuse_req_init_context(req);
> + fuse_req_init_context(fc, req);
> __set_bit(FR_WAITING, &req->flags);
> __clear_bit(FR_BACKGROUND, &req->flags);
> return req;
> @@ -1243,6 +1248,9 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
> struct fuse_in *in;
> unsigned reqsize;
>
> + if (task_active_pid_ns(current) != fc->pid_ns)
> + return -EIO;
> +
> restart:
> spin_lock(&fiq->waitq.lock);
> err = -EAGAIN;
> @@ -1872,6 +1880,9 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
> struct fuse_req *req;
> struct fuse_out_header oh;
>
> + if (task_active_pid_ns(current) != fc->pid_ns)
> + return -EIO;
> +
> if (nbytes < sizeof(struct fuse_out_header))
> return -EINVAL;
>
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 719924d6c706..b5c616c5ec98 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -2067,7 +2067,8 @@ static int fuse_direct_mmap(struct file *file, struct vm_area_struct *vma)
> return generic_file_mmap(file, vma);
> }
>
> -static int convert_fuse_file_lock(const struct fuse_file_lock *ffl,
> +static int convert_fuse_file_lock(struct fuse_conn *fc,
> + const struct fuse_file_lock *ffl,
> struct file_lock *fl)
> {
> switch (ffl->type) {
> @@ -2082,7 +2083,14 @@ static int convert_fuse_file_lock(const struct fuse_file_lock *ffl,
>
> fl->fl_start = ffl->start;
> fl->fl_end = ffl->end;
> - fl->fl_pid = ffl->pid;
> +
> + /*
> + * Convert pid into the caller's pid namespace. If the pid
> + * does not map into the namespace fl_pid will get set to 0.
> + */
> + rcu_read_lock();
> + fl->fl_pid = pid_vnr(find_pid_ns(ffl->pid, fc->pid_ns));
> + rcu_read_unlock();
> break;
>
> default:
> @@ -2131,7 +2139,7 @@ static int fuse_getlk(struct file *file, struct file_lock *fl)
> args.out.args[0].value = &outarg;
> err = fuse_simple_request(fc, &args);
> if (!err)
> - err = convert_fuse_file_lock(&outarg.lk, fl);
> + err = convert_fuse_file_lock(fc, &outarg.lk, fl);
>
> return err;
> }
> @@ -2143,7 +2151,8 @@ static int fuse_setlk(struct file *file, struct file_lock *fl, int flock)
> FUSE_ARGS(args);
> struct fuse_lk_in inarg;
> int opcode = (fl->fl_flags & FL_SLEEP) ? FUSE_SETLKW : FUSE_SETLK;
> - pid_t pid = fl->fl_type != F_UNLCK ? current->tgid : 0;
> + struct pid *pid = fl->fl_type != F_UNLCK ? task_tgid(current) : NULL;
> + pid_t pid_nr = pid_nr_ns(pid, fc->pid_ns);
> int err;
>
> if (fl->fl_lmops && fl->fl_lmops->lm_grant) {
> @@ -2155,7 +2164,10 @@ static int fuse_setlk(struct file *file, struct file_lock *fl, int flock)
> if (fl->fl_flags & FL_CLOSE)
> return 0;
>
> - fuse_lk_fill(&args, file, fl, opcode, pid, flock, &inarg);
> + if (pid && pid_nr == 0)
> + return -EOVERFLOW;
> +
> + fuse_lk_fill(&args, file, fl, opcode, pid_nr, flock, &inarg);
> err = fuse_simple_request(fc, &args);
>
> /* locking is restartable */
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> index eddbe02c4028..9145445a759a 100644
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -23,6 +23,7 @@
> #include <linux/poll.h>
> #include <linux/workqueue.h>
> #include <linux/kref.h>
> +#include <linux/pid_namespace.h>
>
> /** Max number of pages that can be used in a single read request */
> #define FUSE_MAX_PAGES_PER_REQ 32
> @@ -465,6 +466,9 @@ struct fuse_conn {
> /** The group id for this mount */
> kgid_t group_id;
>
> + /** The pid namespace for this mount */
> + struct pid_namespace *pid_ns;
> +
> /** The fuse mount flags for this mount */
> unsigned flags;
>
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index 1ce67668a8e1..eade0bfa4488 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -20,6 +20,7 @@
> #include <linux/random.h>
> #include <linux/sched.h>
> #include <linux/exportfs.h>
> +#include <linux/pid_namespace.h>
>
> MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
> MODULE_DESCRIPTION("Filesystem in Userspace");
> @@ -609,6 +610,7 @@ void fuse_conn_init(struct fuse_conn *fc)
> fc->connected = 1;
> fc->attr_version = 1;
> get_random_bytes(&fc->scramble_key, sizeof(fc->scramble_key));
> + fc->pid_ns = get_pid_ns(task_active_pid_ns(current));
> }
> EXPORT_SYMBOL_GPL(fuse_conn_init);
>
> @@ -617,6 +619,7 @@ void fuse_conn_put(struct fuse_conn *fc)
> if (atomic_dec_and_test(&fc->count)) {
> if (fc->destroy_req)
> fuse_request_free(fc->destroy_req);
> + put_pid_ns(fc->pid_ns);
> fc->release(fc);
> }
> }
> --
> 2.7.4
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH v4 19/21] fuse: Support fuse filesystems outside of init_user_ns
2016-04-26 19:36 ` Seth Forshee
@ 2016-04-26 19:36 ` Seth Forshee
-1 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Miklos Szeredi
Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee,
dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi,
linux-security-module-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-raid-u79uwXL29TY76Z2rM5mHXA,
fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn,
linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Alexander Viro,
selinux-+05T5uksL2qpZYMLLGbcSA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov
In order to support mounts from namespaces other than
init_user_ns, fuse must translate uids and gids to/from the
userns of the process servicing requests on /dev/fuse. This
patch does that, with a couple of restrictions on the namespace:
- The userns for the fuse connection is fixed to the namespace
from which /dev/fuse is opened.
- The namespace must be the same as s_user_ns.
These restrictions simplify the implementation by avoiding the
need to pass around userns references and by allowing fuse to
rely on the checks in inode_change_ok for ownership changes.
Either restriction could be relaxed in the future if needed.
For cuse the namespace used for the connection is also simply
current_user_ns() at the time /dev/cuse is opened.
Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
---
fs/fuse/cuse.c | 3 ++-
fs/fuse/dev.c | 13 ++++++++-----
fs/fuse/dir.c | 14 +++++++-------
fs/fuse/fuse_i.h | 6 +++++-
fs/fuse/inode.c | 33 +++++++++++++++++++++------------
5 files changed, 43 insertions(+), 26 deletions(-)
diff --git a/fs/fuse/cuse.c b/fs/fuse/cuse.c
index c5b6b7165489..98ebd0f4fd4c 100644
--- a/fs/fuse/cuse.c
+++ b/fs/fuse/cuse.c
@@ -48,6 +48,7 @@
#include <linux/stat.h>
#include <linux/module.h>
#include <linux/uio.h>
+#include <linux/user_namespace.h>
#include "fuse_i.h"
@@ -498,7 +499,7 @@ static int cuse_channel_open(struct inode *inode, struct file *file)
if (!cc)
return -ENOMEM;
- fuse_conn_init(&cc->fc);
+ fuse_conn_init(&cc->fc, current_user_ns());
fud = fuse_dev_alloc(&cc->fc);
if (!fud) {
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 4e91b2ac25a7..8fa1ce934df3 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -127,8 +127,8 @@ static void __fuse_put_request(struct fuse_req *req)
static void fuse_req_init_context(struct fuse_conn *fc, struct fuse_req *req)
{
- req->in.h.uid = from_kuid_munged(&init_user_ns, current_fsuid());
- req->in.h.gid = from_kgid_munged(&init_user_ns, current_fsgid());
+ req->in.h.uid = from_kuid(fc->user_ns, current_fsuid());
+ req->in.h.gid = from_kgid(fc->user_ns, current_fsgid());
req->in.h.pid = pid_nr_ns(task_pid(current), fc->pid_ns);
}
@@ -186,7 +186,8 @@ static struct fuse_req *__fuse_get_req(struct fuse_conn *fc, unsigned npages,
__set_bit(FR_WAITING, &req->flags);
if (for_background)
__set_bit(FR_BACKGROUND, &req->flags);
- if (req->in.h.pid == 0) {
+ if (req->in.h.pid == 0 || req->in.h.uid == (uid_t)-1 ||
+ req->in.h.gid == (gid_t)-1) {
fuse_put_request(fc, req);
return ERR_PTR(-EOVERFLOW);
}
@@ -1248,7 +1249,8 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
struct fuse_in *in;
unsigned reqsize;
- if (task_active_pid_ns(current) != fc->pid_ns)
+ if (task_active_pid_ns(current) != fc->pid_ns ||
+ current_user_ns() != fc->user_ns)
return -EIO;
restart:
@@ -1880,7 +1882,8 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
struct fuse_req *req;
struct fuse_out_header oh;
- if (task_active_pid_ns(current) != fc->pid_ns)
+ if (task_active_pid_ns(current) != fc->pid_ns ||
+ current_user_ns() != fc->user_ns)
return -EIO;
if (nbytes < sizeof(struct fuse_out_header))
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 4b855b65d457..ecba75bf6640 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -841,8 +841,8 @@ static void fuse_fillattr(struct inode *inode, struct fuse_attr *attr,
stat->ino = attr->ino;
stat->mode = (inode->i_mode & S_IFMT) | (attr->mode & 07777);
stat->nlink = attr->nlink;
- stat->uid = make_kuid(&init_user_ns, attr->uid);
- stat->gid = make_kgid(&init_user_ns, attr->gid);
+ stat->uid = make_kuid(fc->user_ns, attr->uid);
+ stat->gid = make_kgid(fc->user_ns, attr->gid);
stat->rdev = inode->i_rdev;
stat->atime.tv_sec = attr->atime;
stat->atime.tv_nsec = attr->atimensec;
@@ -1459,17 +1459,17 @@ static bool update_mtime(unsigned ivalid, bool trust_local_mtime)
return true;
}
-static void iattr_to_fattr(struct iattr *iattr, struct fuse_setattr_in *arg,
- bool trust_local_cmtime)
+static void iattr_to_fattr(struct fuse_conn *fc, struct iattr *iattr,
+ struct fuse_setattr_in *arg, bool trust_local_cmtime)
{
unsigned ivalid = iattr->ia_valid;
if (ivalid & ATTR_MODE)
arg->valid |= FATTR_MODE, arg->mode = iattr->ia_mode;
if (ivalid & ATTR_UID)
- arg->valid |= FATTR_UID, arg->uid = from_kuid(&init_user_ns, iattr->ia_uid);
+ arg->valid |= FATTR_UID, arg->uid = from_kuid(fc->user_ns, iattr->ia_uid);
if (ivalid & ATTR_GID)
- arg->valid |= FATTR_GID, arg->gid = from_kgid(&init_user_ns, iattr->ia_gid);
+ arg->valid |= FATTR_GID, arg->gid = from_kgid(fc->user_ns, iattr->ia_gid);
if (ivalid & ATTR_SIZE)
arg->valid |= FATTR_SIZE, arg->size = iattr->ia_size;
if (ivalid & ATTR_ATIME) {
@@ -1629,7 +1629,7 @@ int fuse_do_setattr(struct inode *inode, struct iattr *attr,
memset(&inarg, 0, sizeof(inarg));
memset(&outarg, 0, sizeof(outarg));
- iattr_to_fattr(attr, &inarg, trust_local_cmtime);
+ iattr_to_fattr(fc, attr, &inarg, trust_local_cmtime);
if (file) {
struct fuse_file *ff = file->private_data;
inarg.valid |= FATTR_FH;
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 9145445a759a..9f4c3c82edd6 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -24,6 +24,7 @@
#include <linux/workqueue.h>
#include <linux/kref.h>
#include <linux/pid_namespace.h>
+#include <linux/user_namespace.h>
/** Max number of pages that can be used in a single read request */
#define FUSE_MAX_PAGES_PER_REQ 32
@@ -469,6 +470,9 @@ struct fuse_conn {
/** The pid namespace for this mount */
struct pid_namespace *pid_ns;
+ /** The user namespace for this mount */
+ struct user_namespace *user_ns;
+
/** The fuse mount flags for this mount */
unsigned flags;
@@ -867,7 +871,7 @@ struct fuse_conn *fuse_conn_get(struct fuse_conn *fc);
/**
* Initialize fuse_conn
*/
-void fuse_conn_init(struct fuse_conn *fc);
+void fuse_conn_init(struct fuse_conn *fc, struct user_namespace *user_ns);
/**
* Release reference to fuse_conn
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index eade0bfa4488..0a771145d853 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -167,8 +167,8 @@ void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
inode->i_ino = fuse_squash_ino(attr->ino);
inode->i_mode = (inode->i_mode & S_IFMT) | (attr->mode & 07777);
set_nlink(inode, attr->nlink);
- inode->i_uid = make_kuid(&init_user_ns, attr->uid);
- inode->i_gid = make_kgid(&init_user_ns, attr->gid);
+ inode->i_uid = make_kuid(fc->user_ns, attr->uid);
+ inode->i_gid = make_kgid(fc->user_ns, attr->gid);
inode->i_blocks = attr->blocks;
inode->i_atime.tv_sec = attr->atime;
inode->i_atime.tv_nsec = attr->atimensec;
@@ -467,7 +467,8 @@ static int fuse_match_uint(substring_t *s, unsigned int *res)
return err;
}
-static int parse_fuse_opt(char *opt, struct fuse_mount_data *d, int is_bdev)
+static int parse_fuse_opt(char *opt, struct fuse_mount_data *d, int is_bdev,
+ struct user_namespace *user_ns)
{
char *p;
memset(d, 0, sizeof(struct fuse_mount_data));
@@ -503,7 +504,7 @@ static int parse_fuse_opt(char *opt, struct fuse_mount_data *d, int is_bdev)
case OPT_USER_ID:
if (fuse_match_uint(&args[0], &uv))
return 0;
- d->user_id = make_kuid(current_user_ns(), uv);
+ d->user_id = make_kuid(user_ns, uv);
if (!uid_valid(d->user_id))
return 0;
d->user_id_present = 1;
@@ -512,7 +513,7 @@ static int parse_fuse_opt(char *opt, struct fuse_mount_data *d, int is_bdev)
case OPT_GROUP_ID:
if (fuse_match_uint(&args[0], &uv))
return 0;
- d->group_id = make_kgid(current_user_ns(), uv);
+ d->group_id = make_kgid(user_ns, uv);
if (!gid_valid(d->group_id))
return 0;
d->group_id_present = 1;
@@ -555,8 +556,10 @@ static int fuse_show_options(struct seq_file *m, struct dentry *root)
struct super_block *sb = root->d_sb;
struct fuse_conn *fc = get_fuse_conn_super(sb);
- seq_printf(m, ",user_id=%u", from_kuid_munged(&init_user_ns, fc->user_id));
- seq_printf(m, ",group_id=%u", from_kgid_munged(&init_user_ns, fc->group_id));
+ seq_printf(m, ",user_id=%u",
+ from_kuid_munged(fc->user_ns, fc->user_id));
+ seq_printf(m, ",group_id=%u",
+ from_kgid_munged(fc->user_ns, fc->group_id));
if (fc->flags & FUSE_DEFAULT_PERMISSIONS)
seq_puts(m, ",default_permissions");
if (fc->flags & FUSE_ALLOW_OTHER)
@@ -587,7 +590,7 @@ static void fuse_pqueue_init(struct fuse_pqueue *fpq)
fpq->connected = 1;
}
-void fuse_conn_init(struct fuse_conn *fc)
+void fuse_conn_init(struct fuse_conn *fc, struct user_namespace *user_ns)
{
memset(fc, 0, sizeof(*fc));
spin_lock_init(&fc->lock);
@@ -611,6 +614,7 @@ void fuse_conn_init(struct fuse_conn *fc)
fc->attr_version = 1;
get_random_bytes(&fc->scramble_key, sizeof(fc->scramble_key));
fc->pid_ns = get_pid_ns(task_active_pid_ns(current));
+ fc->user_ns = get_user_ns(user_ns);
}
EXPORT_SYMBOL_GPL(fuse_conn_init);
@@ -620,6 +624,7 @@ void fuse_conn_put(struct fuse_conn *fc)
if (fc->destroy_req)
fuse_request_free(fc->destroy_req);
put_pid_ns(fc->pid_ns);
+ put_user_ns(fc->user_ns);
fc->release(fc);
}
}
@@ -1046,7 +1051,7 @@ static int fuse_fill_super(struct super_block *sb, void *data, int silent)
sb->s_flags &= ~(MS_NOSEC | MS_I_VERSION);
- if (!parse_fuse_opt(data, &d, is_bdev))
+ if (!parse_fuse_opt(data, &d, is_bdev, sb->s_user_ns))
goto err;
if (is_bdev) {
@@ -1070,8 +1075,12 @@ static int fuse_fill_super(struct super_block *sb, void *data, int silent)
if (!file)
goto err;
- if ((file->f_op != &fuse_dev_operations) ||
- (file->f_cred->user_ns != &init_user_ns))
+ /*
+ * Require mount to happen from the same user namespace which
+ * opened /dev/fuse to prevent potential attacks.
+ */
+ if (file->f_op != &fuse_dev_operations ||
+ file->f_cred->user_ns != sb->s_user_ns)
goto err_fput;
fc = kmalloc(sizeof(*fc), GFP_KERNEL);
@@ -1079,7 +1088,7 @@ static int fuse_fill_super(struct super_block *sb, void *data, int silent)
if (!fc)
goto err_fput;
- fuse_conn_init(fc);
+ fuse_conn_init(fc, sb->s_user_ns);
fc->release = fuse_free_conn;
fud = fuse_dev_alloc(fc);
--
2.7.4
------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v4 19/21] fuse: Support fuse filesystems outside of init_user_ns
@ 2016-04-26 19:36 ` Seth Forshee
0 siblings, 0 replies; 50+ messages in thread
From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw)
To: Eric W. Biederman, Miklos Szeredi
Cc: Alexander Viro, Serge Hallyn, Richard Weinberger,
Austin S Hemmelgarn, Miklos Szeredi, Pavel Tikhomirov,
linux-kernel, linux-bcache, dm-devel, linux-raid, linux-mtd,
linux-fsdevel, fuse-devel, linux-security-module, selinux,
cgroups, Seth Forshee
In order to support mounts from namespaces other than
init_user_ns, fuse must translate uids and gids to/from the
userns of the process servicing requests on /dev/fuse. This
patch does that, with a couple of restrictions on the namespace:
- The userns for the fuse connection is fixed to the namespace
from which /dev/fuse is opened.
- The namespace must be the same as s_user_ns.
These restrictions simplify the implementation by avoiding the
need to pass around userns references and by allowing fuse to
rely on the checks in inode_change_ok for ownership changes.
Either restriction could be relaxed in the future if needed.
For cuse the namespace used for the connection is also simply
current_user_ns() at the time /dev/cuse is opened.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
---
fs/fuse/cuse.c | 3 ++-
fs/fuse/dev.c | 13 ++++++++-----
fs/fuse/dir.c | 14 +++++++-------
fs/fuse/fuse_i.h | 6 +++++-
fs/fuse/inode.c | 33 +++++++++++++++++++++------------
5 files changed, 43 insertions(+), 26 deletions(-)
diff --git a/fs/fuse/cuse.c b/fs/fuse/cuse.c
index c5b6b7165489..98ebd0f4fd4c 100644
--- a/fs/fuse/cuse.c
+++ b/fs/fuse/cuse.c
@@ -48,6 +48,7 @@
#include <linux/stat.h>
#include <linux/module.h>
#include <linux/uio.h>
+#include <linux/user_namespace.h>
#include "fuse_i.h"
@@ -498,7 +499,7 @@ static int cuse_channel_open(struct inode *inode, struct file *file)
if (!cc)
return -ENOMEM;
- fuse_conn_init(&cc->fc);
+ fuse_conn_init(&cc->fc, current_user_ns());
fud = fuse_dev_alloc(&cc->fc);
if (!fud) {
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 4e91b2ac25a7..8fa1ce934df3 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -127,8 +127,8 @@ static void __fuse_put_request(struct fuse_req *req)
static void fuse_req_init_context(struct fuse_conn *fc, struct fuse_req *req)
{
- req->in.h.uid = from_kuid_munged(&init_user_ns, current_fsuid());
- req->in.h.gid = from_kgid_munged(&init_user_ns, current_fsgid());
+ req->in.h.uid = from_kuid(fc->user_ns, current_fsuid());
+ req->in.h.gid = from_kgid(fc->user_ns, current_fsgid());
req->in.h.pid = pid_nr_ns(task_pid(current), fc->pid_ns);
}
@@ -186,7 +186,8 @@ static struct fuse_req *__fuse_get_req(struct fuse_conn *fc, unsigned npages,
__set_bit(FR_WAITING, &req->flags);
if (for_background)
__set_bit(FR_BACKGROUND, &req->flags);
- if (req->in.h.pid == 0) {
+ if (req->in.h.pid == 0 || req->in.h.uid == (uid_t)-1 ||
+ req->in.h.gid == (gid_t)-1) {
fuse_put_request(fc, req);
return ERR_PTR(-EOVERFLOW);
}
@@ -1248,7 +1249,8 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
struct fuse_in *in;
unsigned reqsize;
- if (task_active_pid_ns(current) != fc->pid_ns)
+ if (task_active_pid_ns(current) != fc->pid_ns ||
+ current_user_ns() != fc->user_ns)
return -EIO;
restart:
@@ -1880,7 +1882,8 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
struct fuse_req *req;
struct fuse_out_header oh;
- if (task_active_pid_ns(current) != fc->pid_ns)
+ if (task_active_pid_ns(current) != fc->pid_ns ||
+ current_user_ns() != fc->user_ns)
return -EIO;
if (nbytes < sizeof(struct fuse_out_header))
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 4b855b65d457..ecba75bf6640 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -841,8 +841,8 @@ static void fuse_fillattr(struct inode *inode, struct fuse_attr *attr,
stat->ino = attr->ino;
stat->mode = (inode->i_mode & S_IFMT) | (attr->mode & 07777);
stat->nlink = attr->nlink;
- stat->uid = make_kuid(&init_user_ns, attr->uid);
- stat->gid = make_kgid(&init_user_ns, attr->gid);
+ stat->uid = make_kuid(fc->user_ns, attr->uid);
+ stat->gid = make_kgid(fc->user_ns, attr->gid);
stat->rdev = inode->i_rdev;
stat->atime.tv_sec = attr->atime;
stat->atime.tv_nsec = attr->atimensec;
@@ -1459,17 +1459,17 @@ static bool update_mtime(unsigned ivalid, bool trust_local_mtime)
return true;
}
-static void iattr_to_fattr(struct iattr *iattr, struct fuse_setattr_in *arg,
- bool trust_local_cmtime)
+static void iattr_to_fattr(struct fuse_conn *fc, struct iattr *iattr,
+ struct fuse_setattr_in *arg, bool trust_local_cmtime)
{
unsigned ivalid = iattr->ia_valid;
if (ivalid & ATTR_MODE)
arg->valid |= FATTR_MODE, arg->mode = iattr->ia_mode;
if (ivalid & ATTR_UID)
- arg->valid |= FATTR_UID, arg->uid = from_kuid(&init_user_ns, iattr->ia_uid);
+ arg->valid |= FATTR_UID, arg->uid = from_kuid(fc->user_ns, iattr->ia_uid);
if (ivalid & ATTR_GID)
- arg->valid |= FATTR_GID, arg->gid = from_kgid(&init_user_ns, iattr->ia_gid);
+ arg->valid |= FATTR_GID, arg->gid = from_kgid(fc->user_ns, iattr->ia_gid);
if (ivalid & ATTR_SIZE)
arg->valid |= FATTR_SIZE, arg->size = iattr->ia_size;
if (ivalid & ATTR_ATIME) {
@@ -1629,7 +1629,7 @@ int fuse_do_setattr(struct inode *inode, struct iattr *attr,
memset(&inarg, 0, sizeof(inarg));
memset(&outarg, 0, sizeof(outarg));
- iattr_to_fattr(attr, &inarg, trust_local_cmtime);
+ iattr_to_fattr(fc, attr, &inarg, trust_local_cmtime);
if (file) {
struct fuse_file *ff = file->private_data;
inarg.valid |= FATTR_FH;
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 9145445a759a..9f4c3c82edd6 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -24,6 +24,7 @@
#include <linux/workqueue.h>
#include <linux/kref.h>
#include <linux/pid_namespace.h>
+#include <linux/user_namespace.h>
/** Max number of pages that can be used in a single read request */
#define FUSE_MAX_PAGES_PER_REQ 32
@@ -469,6 +470,9 @@ struct fuse_conn {
/** The pid namespace for this mount */
struct pid_namespace *pid_ns;
+ /** The user namespace for this mount */
+ struct user_namespace *user_ns;
+
/** The fuse mount flags for this mount */
unsigned flags;
@@ -867,7 +871,7 @@ struct fuse_conn *fuse_conn_get(struct fuse_conn *fc);
/**
* Initialize fuse_conn
*/
-void fuse_conn_init(struct fuse_conn *fc);
+void fuse_conn_init(struct fuse_conn *fc, struct user_namespace *user_ns);
/**
* Release reference to fuse_conn
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index eade0bfa4488..0a771145d853 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -167,8 +167,8 @@ void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
inode->i_ino = fuse_squash_ino(attr->ino);
inode->i_mode = (inode->i_mode & S_IFMT) | (attr->mode & 07777);
set_nlink(inode, attr->nlink);
- inode->i_uid = make_kuid(&init_user_ns, attr->uid);
- inode->i_gid = make_kgid(&init_user_ns, attr->gid);
+ inode->i_uid = make_kuid(fc->user_ns, attr->uid);
+ inode->i_gid = make_kgid(fc->user_ns, attr->gid);
inode->i_blocks = attr->blocks;
inode->i_atime.tv_sec = attr->atime;
inode->i_atime.tv_nsec = attr->atimensec;
@@ -467,7 +467,8 @@ static int fuse_match_uint(substring_t *s, unsigned int *res)
return err;
}
-static int parse_fuse_opt(char *opt, struct fuse_mount_data *d, int is_bdev)
+static int parse_fuse_opt(char *opt, struct fuse_mount_data *d, int is_bdev,
+ struct user_namespace *user_ns)
{
char *p;
memset(d, 0, sizeof(struct fuse_mount_data));
@@ -503,7 +504,7 @@ static int parse_fuse_opt(char *opt, struct fuse_mount_data *d, int is_bdev)
case OPT_USER_ID:
if (fuse_match_uint(&args[0], &uv))
return 0;
- d->user_id = make_kuid(current_user_ns(), uv);
+ d->user_id = make_kuid(user_ns, uv);
if (!uid_valid(d->user_id))
return 0;
d->user_id_present = 1;
@@ -512,7 +513,7 @@ static int parse_fuse_opt(char *opt, struct fuse_mount_data *d, int is_bdev)
case OPT_GROUP_ID:
if (fuse_match_uint(&args[0], &uv))
return 0;
- d->group_id = make_kgid(current_user_ns(), uv);
+ d->group_id = make_kgid(user_ns, uv);
if (!gid_valid(d->group_id))
return 0;
d->group_id_present = 1;
@@ -555,8 +556,10 @@ static int fuse_show_options(struct seq_file *m, struct dentry *root)
struct super_block *sb = root->d_sb;
struct fuse_conn *fc = get_fuse_conn_super(sb);
- seq_printf(m, ",user_id=%u", from_kuid_munged(&init_user_ns, fc->user_id));
- seq_printf(m, ",group_id=%u", from_kgid_munged(&init_user_ns, fc->group_id));
+ seq_printf(m, ",user_id=%u",
+ from_kuid_munged(fc->user_ns, fc->user_id));
+ seq_printf(m, ",group_id=%u",
+ from_kgid_munged(fc->user_ns, fc->group_id));
if (fc->flags & FUSE_DEFAULT_PERMISSIONS)
seq_puts(m, ",default_permissions");
if (fc->flags & FUSE_ALLOW_OTHER)
@@ -587,7 +590,7 @@ static void fuse_pqueue_init(struct fuse_pqueue *fpq)
fpq->connected = 1;
}
-void fuse_conn_init(struct fuse_conn *fc)
+void fuse_conn_init(struct fuse_conn *fc, struct user_namespace *user_ns)
{
memset(fc, 0, sizeof(*fc));
spin_lock_init(&fc->lock);
@@ -611,6 +614,7 @@ void fuse_conn_init(struct fuse_conn *fc)
fc->attr_version = 1;
get_random_bytes(&fc->scramble_key, sizeof(fc->scramble_key));
fc->pid_ns = get_pid_ns(task_active_pid_ns(current));
+ fc->user_ns = get_user_ns(user_ns);
}
EXPORT_SYMBOL_GPL(fuse_conn_init);
@@ -620,6 +624,7 @@ void fuse_conn_put(struct fuse_conn *fc)
if (fc->destroy_req)
fuse_request_free(fc->destroy_req);
put_pid_ns(fc->pid_ns);
+ put_user_ns(fc->user_ns);
fc->release(fc);
}
}
@@ -1046,7 +1051,7 @@ static int fuse_fill_super(struct super_block *sb, void *data, int silent)
sb->s_flags &= ~(MS_NOSEC | MS_I_VERSION);
- if (!parse_fuse_opt(data, &d, is_bdev))
+ if (!parse_fuse_opt(data, &d, is_bdev, sb->s_user_ns))
goto err;
if (is_bdev) {
@@ -1070,8 +1075,12 @@ static int fuse_fill_super(struct super_block *sb, void *data, int silent)
if (!file)
goto err;
- if ((file->f_op != &fuse_dev_operations) ||
- (file->f_cred->user_ns != &init_user_ns))
+ /*
+ * Require mount to happen from the same user namespace which
+ * opened /dev/fuse to prevent potential attacks.
+ */
+ if (file->f_op != &fuse_dev_operations ||
+ file->f_cred->user_ns != sb->s_user_ns)
goto err_fput;
fc = kmalloc(sizeof(*fc), GFP_KERNEL);
@@ -1079,7 +1088,7 @@ static int fuse_fill_super(struct super_block *sb, void *data, int silent)
if (!fc)
goto err_fput;
- fuse_conn_init(fc);
+ fuse_conn_init(fc, sb->s_user_ns);
fc->release = fuse_free_conn;
fud = fuse_dev_alloc(fc);
--
2.7.4
^ permalink raw reply related [flat|nested] 50+ messages in thread