linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] fs: Fix the default values of i_uid/i_gid on /proc/sys inodes.
@ 2019-07-05 16:30 Radoslaw Burny
  2019-07-05 20:02 ` Luis Chamberlain
  0 siblings, 1 reply; 4+ messages in thread
From: Radoslaw Burny @ 2019-07-05 16:30 UTC (permalink / raw)
  To: Luis R . Rodriguez, Kees Cook, Eric W . Biederman, Seth Forshee
  Cc: linux-kernel, linux-fsdevel, jsperbeck, Radoslaw Burny

This also fixes a problem where, in a user namespace without root user
mapping, it is not possible to write to /proc/sys/kernel/shmmax.

The problem was introduced by the combination of the two commits:
* 81754357770ebd900801231e7bc8d151ddc00498: fs: Update
  i_[ug]id_(read|write) to translate relative to s_user_ns
    - this caused the kernel to write INVALID_[UG]ID to i_uid/i_gid
    members of /proc/sys inodes if a containing userns does not have
    entries for root in the uid/gid_map.
* 0bd23d09b874e53bd1a2fe2296030aa2720d7b08: vfs: Don't modify inodes
  with a uid or gid unknown to the vfs
    - changed the kernel to prevent opens for write if the i_uid/i_gid
    field in the inode is invalid

This commit fixes the issue by defaulting i_uid/i_gid to
GLOBAL_ROOT_UID/GID. Note that these values are not used for /proc/sys
access checks, so the change does not otherwise affect /proc semantics.

Tested: Used a repro program that creates a user namespace without any
mapping and stat'ed /proc/$PID/root/proc/sys/kernel/shmmax from outside.
Before the change, it shows the overflow uid, with the change it's 0.

Signed-off-by: Radoslaw Burny <rburny@google.com>
---
Changelog since v1:
- Updated the commit title and description.

 fs/proc/proc_sysctl.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index c74570736b24..36ad1b0d6259 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -499,6 +499,10 @@ static struct inode *proc_sys_make_inode(struct super_block *sb,
 
 	if (root->set_ownership)
 		root->set_ownership(head, table, &inode->i_uid, &inode->i_gid);
+	else {
+		inode->i_uid = GLOBAL_ROOT_UID;
+		inode->i_gid = GLOBAL_ROOT_GID;
+	}
 
 	return inode;
 }
-- 
2.22.0.410.gd8fdbe21b5-goog


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] fs: Fix the default values of i_uid/i_gid on /proc/sys inodes.
  2019-07-05 16:30 [PATCH v2] fs: Fix the default values of i_uid/i_gid on /proc/sys inodes Radoslaw Burny
@ 2019-07-05 20:02 ` Luis Chamberlain
  2019-07-05 22:19   ` Radoslaw Burny
  0 siblings, 1 reply; 4+ messages in thread
From: Luis Chamberlain @ 2019-07-05 20:02 UTC (permalink / raw)
  To: Radoslaw Burny
  Cc: Kees Cook, Eric W . Biederman, Seth Forshee, linux-kernel,
	linux-fsdevel, jsperbeck


Please re-state the main fix in the commit log, not just the subject.

Also, this does not explain why the current values are and the impact to
systems / users. This would help in determine and evaluating if this
deserves to be a stable fix.

On Fri, Jul 05, 2019 at 06:30:21PM +0200, Radoslaw Burny wrote:
> This also fixes a problem where, in a user namespace without root user
> mapping, it is not possible to write to /proc/sys/kernel/shmmax.

This does not explain why that should be possible and what impact this
limitation has.

> The problem was introduced by the combination of the two commits:
> * 81754357770ebd900801231e7bc8d151ddc00498: fs: Update
>   i_[ug]id_(read|write) to translate relative to s_user_ns
>     - this caused the kernel to write INVALID_[UG]ID to i_uid/i_gid
>     members of /proc/sys inodes if a containing userns does not have
>     entries for root in the uid/gid_map.
This is 2014 commit merged as of v4.8.

> * 0bd23d09b874e53bd1a2fe2296030aa2720d7b08: vfs: Don't modify inodes
>   with a uid or gid unknown to the vfs
>     - changed the kernel to prevent opens for write if the i_uid/i_gid
>     field in the inode is invalid

This is a 2016 commit merged as of v4.8 as well...

So regardless of the dates of the commits, are you saying this is a
regression you can confirm did not exist prior to v4.8? Did you test
v4.7 to confirm?

> This commit fixes the issue by defaulting i_uid/i_gid to
> GLOBAL_ROOT_UID/GID.

Why is this right?

> Note that these values are not used for /proc/sys
> access checks, so the change does not otherwise affect /proc semantics.
> 
> Tested: Used a repro program that creates a user namespace without any
> mapping and stat'ed /proc/$PID/root/proc/sys/kernel/shmmax from outside.
> Before the change, it shows the overflow uid, with the change it's 0.

Why is the overflow uid bad for user experience? Did you test prior to
v4.8, ie on v4.7 to confirm this is indeed a regression?

You'd want then to also ammend in the commit log a Fixes:  tag with both
commits listed. If this is a stable fix (criteria yet to be determined),
then we'd need a stable tag.

  Luis

> Signed-off-by: Radoslaw Burny <rburny@google.com>
> ---
> Changelog since v1:
> - Updated the commit title and description.
> 
>  fs/proc/proc_sysctl.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> index c74570736b24..36ad1b0d6259 100644
> --- a/fs/proc/proc_sysctl.c
> +++ b/fs/proc/proc_sysctl.c
> @@ -499,6 +499,10 @@ static struct inode *proc_sys_make_inode(struct super_block *sb,
>  
>  	if (root->set_ownership)
>  		root->set_ownership(head, table, &inode->i_uid, &inode->i_gid);
> +	else {
> +		inode->i_uid = GLOBAL_ROOT_UID;
> +		inode->i_gid = GLOBAL_ROOT_GID;
> +	}
>  
>  	return inode;
>  }
> -- 
> 2.22.0.410.gd8fdbe21b5-goog
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] fs: Fix the default values of i_uid/i_gid on /proc/sys inodes.
  2019-07-05 20:02 ` Luis Chamberlain
@ 2019-07-05 22:19   ` Radoslaw Burny
  2019-07-05 22:56     ` Luis Chamberlain
  0 siblings, 1 reply; 4+ messages in thread
From: Radoslaw Burny @ 2019-07-05 22:19 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Kees Cook, Eric W . Biederman, Seth Forshee, linux-kernel,
	linux-fsdevel, John Sperbeck

[-- Attachment #1: Type: text/plain, Size: 5575 bytes --]

On Fri, Jul 5, 2019 at 10:02 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
>
> Please re-state the main fix in the commit log, not just the subject.

Sure, I'll do this. Just to make sure - for every iteration on the
commit message, I need to increment the patch "version" and resend the
whole patch, right?

>
> Also, this does not explain why the current values are and the impact to
> systems / users. This would help in determine and evaluating if this
> deserves to be a stable fix.

This commit a (much overdue) resend of https://lkml.org/lkml/2018/11/30/990
I think Eric's comment on the previous thread explained it best:

> We spoke about this at LPC.  And this is the correct behavioral change.
>
> The problem is there is a default value for i_uid and i_gid that is
> correct in the general case.  That default value is not corect for
> sysctl, because proc is weird.  As the sysctl permission check in
> test_perm are all against GLOBAL_ROOT_UID and GLOBAL_ROOT_GID we did not
> notice that i_uid and i_gid were being set wrong.
>
> So all this patch does is fix the default values i_uid and i_gid.

If my new commit message is still not conveying this clearly, feel
free to suggest the specific wording (I'm new to the kernel patch
process, and I might not be explaining the problems well enough).

>
>
> On Fri, Jul 05, 2019 at 06:30:21PM +0200, Radoslaw Burny wrote:
> > This also fixes a problem where, in a user namespace without root user
> > mapping, it is not possible to write to /proc/sys/kernel/shmmax.
>
> This does not explain why that should be possible and what impact this
> limitation has.

Writing to /proc/sys/kernel/shmmax allows setting a shared memory
limit for that container. Since this is usually a part of container's
initial configuration, one would expect that the container's owner /
creator is able to set the limit. Yet, due to the bug described here,
no process can write the container's shmmax if the container's user
namespace does not contain root mapping.

Using a container with no root mapping seems to be a rare case, but we
do use this configuration at Google, which is how I found the issue.
Also, we use a generic tool to configure the container limits, and the
inability to write any of them causes a hard failure.

>
> > The problem was introduced by the combination of the two commits:
> > * 81754357770ebd900801231e7bc8d151ddc00498: fs: Update
> >   i_[ug]id_(read|write) to translate relative to s_user_ns
> >     - this caused the kernel to write INVALID_[UG]ID to i_uid/i_gid
> >     members of /proc/sys inodes if a containing userns does not have
> >     entries for root in the uid/gid_map.
> This is 2014 commit merged as of v4.8.
>
> > * 0bd23d09b874e53bd1a2fe2296030aa2720d7b08: vfs: Don't modify inodes
> >   with a uid or gid unknown to the vfs
> >     - changed the kernel to prevent opens for write if the i_uid/i_gid
> >     field in the inode is invalid
>
> This is a 2016 commit merged as of v4.8 as well...
>
> So regardless of the dates of the commits, are you saying this is a
> regression you can confirm did not exist prior to v4.8? Did you test
> v4.7 to confirm?

I assume no one has noticed this issue before because it requires such
a specific combination of triggers.
Yes, I've tested this with older kernel versions. I've additionally
tested a 4.8 build with just 0aa2720d7b08 reverted, confirming that
the revert fixes the issue.

>
> > This commit fixes the issue by defaulting i_uid/i_gid to
> > GLOBAL_ROOT_UID/GID.
>
> Why is this right?

Quoting Eric: "the sysctl permission check in test_perm are all
against GLOBAL_ROOT_UID and GLOBAL_ROOT_GID".
The values in the inode are not even read during test_perm, but
logically, the inode belongs to the root of the namespace.

>
> > Note that these values are not used for /proc/sys
> > access checks, so the change does not otherwise affect /proc semantics.
> >
> > Tested: Used a repro program that creates a user namespace without any
> > mapping and stat'ed /proc/$PID/root/proc/sys/kernel/shmmax from outside.
> > Before the change, it shows the overflow uid, with the change it's 0.
>
> Why is the overflow uid bad for user experience? Did you test prior to
> v4.8, ie on v4.7 to confirm this is indeed a regression?
>
> You'd want then to also ammend in the commit log a Fixes:  tag with both
> commits listed. If this is a stable fix (criteria yet to be determined),
> then we'd need a stable tag.

The overflow is technically correct; the uid in the inode is invalid,
hence it must be displayed as overflow uid. The fact that the uid is
invalid is the issue.
Logically, this commit fixes 81754357770e (as that commit first
introduced invalid uid/gid values). If you agree, I'll add this to my
updated commit.

>
>   Luis
>
> > Signed-off-by: Radoslaw Burny <rburny@google.com>
> > ---
> > Changelog since v1:
> > - Updated the commit title and description.
> >
> >  fs/proc/proc_sysctl.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> > index c74570736b24..36ad1b0d6259 100644
> > --- a/fs/proc/proc_sysctl.c
> > +++ b/fs/proc/proc_sysctl.c
> > @@ -499,6 +499,10 @@ static struct inode *proc_sys_make_inode(struct super_block *sb,
> >
> >       if (root->set_ownership)
> >               root->set_ownership(head, table, &inode->i_uid, &inode->i_gid);
> > +     else {
> > +             inode->i_uid = GLOBAL_ROOT_UID;
> > +             inode->i_gid = GLOBAL_ROOT_GID;
> > +     }
> >
> >       return inode;
> >  }
> > --
> > 2.22.0.410.gd8fdbe21b5-goog
> >

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4843 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] fs: Fix the default values of i_uid/i_gid on /proc/sys inodes.
  2019-07-05 22:19   ` Radoslaw Burny
@ 2019-07-05 22:56     ` Luis Chamberlain
  0 siblings, 0 replies; 4+ messages in thread
From: Luis Chamberlain @ 2019-07-05 22:56 UTC (permalink / raw)
  To: Radoslaw Burny
  Cc: Kees Cook, Eric W . Biederman, Seth Forshee, linux-kernel,
	linux-fsdevel, John Sperbeck, Andrew Morton

Please Cc Andrew Morton <akpm@linux-foundation.org> on future follow
ups.

On Sat, Jul 06, 2019 at 12:19:16AM +0200, Radoslaw Burny wrote:
> On Fri, Jul 5, 2019 at 10:02 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> >
> >
> > Please re-state the main fix in the commit log, not just the subject.
> 
> Sure, I'll do this. Just to make sure - for every iteration on the
> commit message, I need to increment the patch "version" and resend the
> whole patch, right?

Right.

> >
> > Also, this does not explain why the current values are and the impact to
> > systems / users. This would help in determine and evaluating if this
> > deserves to be a stable fix.
> 
> This commit a (much overdue) resend of https://lkml.org/lkml/2018/11/30/990
> I think Eric's comment on the previous thread explained it best:

Ah, I knew this smelled familiar. Yes I recall. Please add more
information about all this to the commit log. The more info, the better
including refence to the old discussion and also a distilled summary of
what was discussed.

Preference if you can avoid using lkml.org and instead use this URL
instead, as lkml.org is not under out control and can die, etc.

https://lore.kernel.org/lkml/20181126172607.125782-1-rburny@google.com/

> > We spoke about this at LPC.  And this is the correct behavioral change.

Again, none of this is clear to the patch reviewer and again you didn't
mention any of it.

> >
> > The problem is there is a default value for i_uid and i_gid that is
> > correct in the general case.  That default value is not corect for
> > sysctl, because proc is weird.  As the sysctl permission check in
> > test_perm are all against GLOBAL_ROOT_UID and GLOBAL_ROOT_GID we did not
> > notice that i_uid and i_gid were being set wrong.
> >
> > So all this patch does is fix the default values i_uid and i_gid.
> 
> If my new commit message is still not conveying this clearly, feel
> free to suggest the specific wording (I'm new to the kernel patch
> process, and I might not be explaining the problems well enough).

Please consense the above into the commit log message. What you want
to be made clear is implication issues if this patch is not applied, who
is affected and why.

> > On Fri, Jul 05, 2019 at 06:30:21PM +0200, Radoslaw Burny wrote:
> > > This also fixes a problem where, in a user namespace without root user
> > > mapping, it is not possible to write to /proc/sys/kernel/shmmax.
> >
> > This does not explain why that should be possible and what impact this
> > limitation has.
> 
> Writing to /proc/sys/kernel/shmmax allows setting a shared memory
> limit for that container. Since this is usually a part of container's
> initial configuration, one would expect that the container's owner /
> creator is able to set the limit. Yet, due to the bug described here,
> no process can write the container's shmmax if the container's user
> namespace does not contain root mapping.

Please include this on the commit log. It does seem then worthy as a
stable commit. Please add the Cc: stable tag, ie put this:

Cc: stable@vger.kernel.org # v4.8+

Right above the Signed-off-by tags.

Then the scripts which pick up stable patches will pick this up.

> Using a container with no root mapping seems to be a rare case, but we
> do use this configuration at Google, which is how I found the issue.
> Also, we use a generic tool to configure the container limits, and the
> inability to write any of them causes a hard failure.

This helps folks also, so please include this in the commit log.

> > > The problem was introduced by the combination of the two commits:
> > > * 81754357770ebd900801231e7bc8d151ddc00498: fs: Update
> > >   i_[ug]id_(read|write) to translate relative to s_user_ns
> > >     - this caused the kernel to write INVALID_[UG]ID to i_uid/i_gid
> > >     members of /proc/sys inodes if a containing userns does not have
> > >     entries for root in the uid/gid_map.
> > This is 2014 commit merged as of v4.8.
> >
> > > * 0bd23d09b874e53bd1a2fe2296030aa2720d7b08: vfs: Don't modify inodes
> > >   with a uid or gid unknown to the vfs
> > >     - changed the kernel to prevent opens for write if the i_uid/i_gid
> > >     field in the inode is invalid
> >
> > This is a 2016 commit merged as of v4.8 as well...
> >
> > So regardless of the dates of the commits, are you saying this is a
> > regression you can confirm did not exist prior to v4.8? Did you test
> > v4.7 to confirm?
> 
> I assume no one has noticed this issue before because it requires such
> a specific combination of triggers.
> Yes, I've tested this with older kernel versions. I've additionally
> tested a 4.8 build with just 0aa2720d7b08 reverted, confirming that
> the revert fixes the issue.

Ummm 0aa2720d7b08 is the last part of the gitsum, you want to reference
the first part of the gitsum as otherwise git show 0aa2720d7b08 yields
nothing, but git show 0bd23d09b874e does.

OK so then the *real* issue was commit 0bd23d09b874e, so Just add this
tag:

Fixes: 0aa2720d7b08 ("vfs: Don't modify inodes with a uid or gid unknown to the vfs")

So does commit 81754357770ebd really have any problem? If not I see no
reason to mention it?

> > > This commit fixes the issue by defaulting i_uid/i_gid to
> > > GLOBAL_ROOT_UID/GID.
> >
> > Why is this right?
> 
> Quoting Eric: "the sysctl permission check in test_perm are all
> against GLOBAL_ROOT_UID and GLOBAL_ROOT_GID".
> The values in the inode are not even read during test_perm, but
> logically, the inode belongs to the root of the namespace.

Please add that to the commit log.

> >
> > > Note that these values are not used for /proc/sys
> > > access checks, so the change does not otherwise affect /proc semantics.
> > >
> > > Tested: Used a repro program that creates a user namespace without any
> > > mapping and stat'ed /proc/$PID/root/proc/sys/kernel/shmmax from outside.
> > > Before the change, it shows the overflow uid, with the change it's 0.
> >
> > Why is the overflow uid bad for user experience? Did you test prior to
> > v4.8, ie on v4.7 to confirm this is indeed a regression?
> >
> > You'd want then to also ammend in the commit log a Fixes:  tag with both
> > commits listed. If this is a stable fix (criteria yet to be determined),
> > then we'd need a stable tag.
> 
> The overflow is technically correct; the uid in the inode is invalid,
> hence it must be displayed as overflow uid. The fact that the uid is
> invalid is the issue.
> Logically, this commit fixes 81754357770e (as that commit first
> introduced invalid uid/gid values). If you agree, I'll add this to my
> updated commit.

Yes add alll that to the commit log and thanks for following up!
Once you post a follow up, I'll review and poke Andrew to merge.

  Luis

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-07-05 22:56 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-05 16:30 [PATCH v2] fs: Fix the default values of i_uid/i_gid on /proc/sys inodes Radoslaw Burny
2019-07-05 20:02 ` Luis Chamberlain
2019-07-05 22:19   ` Radoslaw Burny
2019-07-05 22:56     ` Luis Chamberlain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).