linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Fix debugfs bind mount regression
@ 2016-03-09 15:18 Seth Forshee
  2016-03-09 15:18 ` [PATCH 1/2] fs: Allow bind mounts with locked children on permaenetly empty directories Seth Forshee
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Seth Forshee @ 2016-03-09 15:18 UTC (permalink / raw)
  To: Eric W. Biederman, linux-fsdevel
  Cc: Serge E. Hallyn, linux-kernel, Seth Forshee

Some full-OS container software bind mounts debugfs into containers to
satisfy the assumptions of older userspaces which expect to be able to
mount debugfs. This regressed in 4.1 due to the addition of tracefs,
which gets automounted in the tracing subdirectory of debugfs. In a
cloned mount namespace the bind mount now fails because the tracefs
mount is a locked child of the debugfs mount.

For new mounts we already make an exception to the "locked child mount"
rule. Directories in psuedo filesystems created for the sole purpose of
being mountpoints are created as permanently empty directories which can
never contain any entries, therefore the kernel can know than any mounts
on these directories are not for security purposes. These mounts are
then excluded from locked mount tests in some circumstances.

The same logic clearly applies to directories created in
debugfs_create_automount(). The following patches update this function
to create permanently empty directories for mountpoints and adds an
exclusion to the tests for bind mounts to exclude child mounts on
permanently empty directories.

Thanks,
Seth

Seth Forshee (2):
  fs: Allow bind mounts with locked children on permaenetly empty
    directories
  debugfs: Make automount point inodes permanently empty

 fs/debugfs/inode.c | 2 +-
 fs/namespace.c     | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] fs: Allow bind mounts with locked children on permaenetly empty directories
  2016-03-09 15:18 [PATCH 0/2] Fix debugfs bind mount regression Seth Forshee
@ 2016-03-09 15:18 ` Seth Forshee
  2016-03-09 20:32   ` Serge E. Hallyn
  2016-03-09 15:18 ` [PATCH 2/2] debugfs: Make automount point inodes permanently empty Seth Forshee
  2016-03-09 20:57 ` [PATCH 0/2] Fix debugfs bind mount regression Eric W. Biederman
  2 siblings, 1 reply; 7+ messages in thread
From: Seth Forshee @ 2016-03-09 15:18 UTC (permalink / raw)
  To: Eric W. Biederman, Alexander Viro
  Cc: Serge E. Hallyn, linux-kernel, linux-fsdevel, Seth Forshee

Forbidding a bind mount due to a locked child on a permanently
empty directory provides no security benefit since the
directory cannot contain any contents which have been overmounted
for security reasons.

Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
---
 fs/namespace.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/namespace.c b/fs/namespace.c
index 4fb1691b4355..930f5557b1d1 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2065,6 +2065,8 @@ static bool has_locked_children(struct mount *mnt, struct dentry *dentry)
 	list_for_each_entry(child, &mnt->mnt_mounts, mnt_child) {
 		if (!is_subdir(child->mnt_mountpoint, dentry))
 			continue;
+		if (is_empty_dir_inode(child->mnt_mountpoint->d_inode))
+			continue;
 
 		if (child->mnt.mnt_flags & MNT_LOCKED)
 			return true;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] debugfs: Make automount point inodes permanently empty
  2016-03-09 15:18 [PATCH 0/2] Fix debugfs bind mount regression Seth Forshee
  2016-03-09 15:18 ` [PATCH 1/2] fs: Allow bind mounts with locked children on permaenetly empty directories Seth Forshee
@ 2016-03-09 15:18 ` Seth Forshee
  2016-03-09 20:32   ` Serge E. Hallyn
  2016-03-09 20:57 ` [PATCH 0/2] Fix debugfs bind mount regression Eric W. Biederman
  2 siblings, 1 reply; 7+ messages in thread
From: Seth Forshee @ 2016-03-09 15:18 UTC (permalink / raw)
  To: Eric W. Biederman, Greg Kroah-Hartman
  Cc: Serge E. Hallyn, linux-kernel, linux-fsdevel, Seth Forshee

Starting with 4.1 the tracing subsystem has its own filesystem
which is automounted in the tracing subdirectory of debugfs.
Prior to this debugfs could be bind mounted in a cloned mount
namespace, but if tracefs has been mounted under debugfs this
now fails because there is a locked child mount. This creates
a regression for container software which bind mounts debugfs
to satisfy the assumption of some userspace software.

In other pseudo filesystems such as proc and sysfs we're already
creating mountpoints like this in such a way that no dirents can
be created in the directories, allowing them to be exceptions to
some MNT_LOCKED tests. In fact we're already do this for the
tracefs mountpoint in sysfs.

Do the same in debugfs_create_automount(), since the intention
here is clearly to create a mountpoint. This fixes the regression,
as locked child mounts on permanently empty directories do not
cause a bind mount to fail.

Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
---
 fs/debugfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index bece948b363d..8580831ed237 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -457,7 +457,7 @@ struct dentry *debugfs_create_automount(const char *name,
 	if (unlikely(!inode))
 		return failed_creating(dentry);
 
-	inode->i_mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO;
+	make_empty_dir_inode(inode);
 	inode->i_flags |= S_AUTOMOUNT;
 	inode->i_private = data;
 	dentry->d_fsdata = (void *)f;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] debugfs: Make automount point inodes permanently empty
  2016-03-09 15:18 ` [PATCH 2/2] debugfs: Make automount point inodes permanently empty Seth Forshee
@ 2016-03-09 20:32   ` Serge E. Hallyn
  0 siblings, 0 replies; 7+ messages in thread
From: Serge E. Hallyn @ 2016-03-09 20:32 UTC (permalink / raw)
  To: Seth Forshee
  Cc: Eric W. Biederman, Greg Kroah-Hartman, Serge E. Hallyn,
	linux-kernel, linux-fsdevel

Quoting Seth Forshee (seth.forshee@canonical.com):
> Starting with 4.1 the tracing subsystem has its own filesystem
> which is automounted in the tracing subdirectory of debugfs.
> Prior to this debugfs could be bind mounted in a cloned mount
> namespace, but if tracefs has been mounted under debugfs this
> now fails because there is a locked child mount. This creates
> a regression for container software which bind mounts debugfs
> to satisfy the assumption of some userspace software.
> 
> In other pseudo filesystems such as proc and sysfs we're already
> creating mountpoints like this in such a way that no dirents can
> be created in the directories, allowing them to be exceptions to
> some MNT_LOCKED tests. In fact we're already do this for the
> tracefs mountpoint in sysfs.
> 
> Do the same in debugfs_create_automount(), since the intention
> here is clearly to create a mountpoint. This fixes the regression,
> as locked child mounts on permanently empty directories do not
> cause a bind mount to fail.
> 
> Cc: stable@vger.kernel.org # v4.1+
> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>

> ---
>  fs/debugfs/inode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
> index bece948b363d..8580831ed237 100644
> --- a/fs/debugfs/inode.c
> +++ b/fs/debugfs/inode.c
> @@ -457,7 +457,7 @@ struct dentry *debugfs_create_automount(const char *name,
>  	if (unlikely(!inode))
>  		return failed_creating(dentry);
>  
> -	inode->i_mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO;
> +	make_empty_dir_inode(inode);
>  	inode->i_flags |= S_AUTOMOUNT;
>  	inode->i_private = data;
>  	dentry->d_fsdata = (void *)f;
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] fs: Allow bind mounts with locked children on permaenetly empty directories
  2016-03-09 15:18 ` [PATCH 1/2] fs: Allow bind mounts with locked children on permaenetly empty directories Seth Forshee
@ 2016-03-09 20:32   ` Serge E. Hallyn
  0 siblings, 0 replies; 7+ messages in thread
From: Serge E. Hallyn @ 2016-03-09 20:32 UTC (permalink / raw)
  To: Seth Forshee
  Cc: Eric W. Biederman, Alexander Viro, Serge E. Hallyn, linux-kernel,
	linux-fsdevel

Quoting Seth Forshee (seth.forshee@canonical.com):
> Forbidding a bind mount due to a locked child on a permanently
> empty directory provides no security benefit since the
> directory cannot contain any contents which have been overmounted
> for security reasons.
> 
> Cc: stable@vger.kernel.org # v4.1+
> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>

> ---
>  fs/namespace.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 4fb1691b4355..930f5557b1d1 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -2065,6 +2065,8 @@ static bool has_locked_children(struct mount *mnt, struct dentry *dentry)
>  	list_for_each_entry(child, &mnt->mnt_mounts, mnt_child) {
>  		if (!is_subdir(child->mnt_mountpoint, dentry))
>  			continue;
> +		if (is_empty_dir_inode(child->mnt_mountpoint->d_inode))
> +			continue;
>  
>  		if (child->mnt.mnt_flags & MNT_LOCKED)
>  			return true;
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] Fix debugfs bind mount regression
  2016-03-09 15:18 [PATCH 0/2] Fix debugfs bind mount regression Seth Forshee
  2016-03-09 15:18 ` [PATCH 1/2] fs: Allow bind mounts with locked children on permaenetly empty directories Seth Forshee
  2016-03-09 15:18 ` [PATCH 2/2] debugfs: Make automount point inodes permanently empty Seth Forshee
@ 2016-03-09 20:57 ` Eric W. Biederman
  2016-03-09 21:18   ` Serge Hallyn
  2 siblings, 1 reply; 7+ messages in thread
From: Eric W. Biederman @ 2016-03-09 20:57 UTC (permalink / raw)
  To: Seth Forshee; +Cc: linux-fsdevel, Serge E. Hallyn, linux-kernel

Seth Forshee <seth.forshee@canonical.com> writes:

> Some full-OS container software bind mounts debugfs into containers to
> satisfy the assumptions of older userspaces which expect to be able to
> mount debugfs. This regressed in 4.1 due to the addition of tracefs,
> which gets automounted in the tracing subdirectory of debugfs. In a
> cloned mount namespace the bind mount now fails because the tracefs
> mount is a locked child of the debugfs mount.
>
> For new mounts we already make an exception to the "locked child mount"
> rule. Directories in psuedo filesystems created for the sole purpose of
> being mountpoints are created as permanently empty directories which can
> never contain any entries, therefore the kernel can know than any mounts
> on these directories are not for security purposes. These mounts are
> then excluded from locked mount tests in some circumstances.
>
> The same logic clearly applies to directories created in
> debugfs_create_automount(). The following patches update this function
> to create permanently empty directories for mountpoints and adds an
> exclusion to the tests for bind mounts to exclude child mounts on
> permanently empty directories.

So I don't know that this approach is bad.  However in reading through
your patch descriptions I do not see any consideration of using
"mount --rbind"  instead of "mount --bind".  AKA adding the MS_REC flag
to your bind mount.

I would think simply using MS_REC would solve this problem, without
needing any additional kernel support.  Am I missing something?

Eric

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] Fix debugfs bind mount regression
  2016-03-09 20:57 ` [PATCH 0/2] Fix debugfs bind mount regression Eric W. Biederman
@ 2016-03-09 21:18   ` Serge Hallyn
  0 siblings, 0 replies; 7+ messages in thread
From: Serge Hallyn @ 2016-03-09 21:18 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Seth Forshee, linux-fsdevel, linux-kernel

Quoting Eric W. Biederman (ebiederm@xmission.com):
> Seth Forshee <seth.forshee@canonical.com> writes:
> 
> > Some full-OS container software bind mounts debugfs into containers to
> > satisfy the assumptions of older userspaces which expect to be able to
> > mount debugfs. This regressed in 4.1 due to the addition of tracefs,
> > which gets automounted in the tracing subdirectory of debugfs. In a
> > cloned mount namespace the bind mount now fails because the tracefs
> > mount is a locked child of the debugfs mount.
> >
> > For new mounts we already make an exception to the "locked child mount"
> > rule. Directories in psuedo filesystems created for the sole purpose of
> > being mountpoints are created as permanently empty directories which can
> > never contain any entries, therefore the kernel can know than any mounts
> > on these directories are not for security purposes. These mounts are
> > then excluded from locked mount tests in some circumstances.
> >
> > The same logic clearly applies to directories created in
> > debugfs_create_automount(). The following patches update this function
> > to create permanently empty directories for mountpoints and adds an
> > exclusion to the tests for bind mounts to exclude child mounts on
> > permanently empty directories.
> 
> So I don't know that this approach is bad.  However in reading through
> your patch descriptions I do not see any consideration of using
> "mount --rbind"  instead of "mount --bind".  AKA adding the MS_REC flag
> to your bind mount.
> 
> I would think simply using MS_REC would solve this problem, without
> needing any additional kernel support.  Am I missing something?

That's what we're doing to work around it fwiw, but it would be nice to
not have to.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-03-09 21:18 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-09 15:18 [PATCH 0/2] Fix debugfs bind mount regression Seth Forshee
2016-03-09 15:18 ` [PATCH 1/2] fs: Allow bind mounts with locked children on permaenetly empty directories Seth Forshee
2016-03-09 20:32   ` Serge E. Hallyn
2016-03-09 15:18 ` [PATCH 2/2] debugfs: Make automount point inodes permanently empty Seth Forshee
2016-03-09 20:32   ` Serge E. Hallyn
2016-03-09 20:57 ` [PATCH 0/2] Fix debugfs bind mount regression Eric W. Biederman
2016-03-09 21:18   ` Serge Hallyn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).