* [PATCH] vfs: bypass may_create_in_sticky check if task has CAP_FOWNER
From: Jeff Layton @ 2022-07-27 12:30 UTC (permalink / raw)
To: viro
Cc: linux-fsdevel, linux-nfs, linux-kernel, Yongchen Yang, Christian Brauner
NFS server is exporting a sticky directory (mode 01777) with root
squashing enabled. Client has protected_regular enabled and then tries to
open a file as root in that directory. File is created (with ownership
set to nobody:nobody) but the open syscall returns an error.

The problem is may_create_in_sticky, which rejects the open even though
the file has already been created/opened. Bypass the checks in
may_create_in_sticky if the task has CAP_FOWNER in the given namespace.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976829
Reported-by: Yongchen Yang <yoyang@redhat.com>
Suggested-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
fs/namei.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/namei.c b/fs/namei.c
index 1f28d3f463c3..170c2396ba29 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1230,7 +1230,8 @@ static int may_create_in_sticky(struct user_namespace *mnt_userns,
 	    (!sysctl_protected_regular && S_ISREG(inode->i_mode)) ||
 	    likely(!(dir_mode & S_ISVTX)) ||
 	    uid_eq(i_uid_into_mnt(mnt_userns, inode), dir_uid) ||
-	    uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)))
+	    uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)) ||
+	    ns_capable(mnt_userns, CAP_FOWNER))
 		return 0;

 	if (likely(dir_mode & 0002) ||
--
2.37.1
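
The condition this patch touches can be modeled in userspace for experimentation. The sketch below is a simplified model, not kernel code: the struct, its field names, and the `patched` switch are hypothetical. It covers regular files only and ignores idmapped mounts and user namespaces. The second `if` is truncated in the hunk above, so its group-writable branch is filled in from fs/namei.c as of roughly v5.19 and should be treated as an assumption.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical inputs; the kernel takes these from the nameidata, the
 * inode, the caller's credentials and the fs.protected_regular sysctl. */
struct sticky_check {
	mode_t dir_mode;       /* mode of the parent directory */
	uid_t dir_uid;         /* owner of the parent directory */
	mode_t file_mode;      /* mode of the file being opened */
	uid_t file_uid;        /* owner of the file being opened */
	uid_t caller_fsuid;    /* fsuid of the task calling open() */
	int protected_regular; /* sysctl value: 0, 1 or 2 */
	bool cap_fowner;       /* does the caller hold CAP_FOWNER? */
	bool patched;          /* apply the extra bypass from this patch */
};

/* Returns 0 if the open may proceed, -EACCES if it is rejected. */
static int model_may_create_in_sticky(const struct sticky_check *c)
{
	assert(c != NULL);

	/* This model only covers regular files (FIFOs are left out). */
	if (!S_ISREG(c->file_mode))
		return 0;

	if (c->protected_regular == 0 ||
	    !(c->dir_mode & S_ISVTX) ||
	    c->file_uid == c->dir_uid ||
	    c->caller_fsuid == c->file_uid ||
	    (c->patched && c->cap_fowner))
		return 0;

	/* World-writable sticky dir, or group-writable with sysctl >= 2. */
	if ((c->dir_mode & 0002) ||
	    ((c->dir_mode & 0020) && c->protected_regular >= 2))
		return -EACCES;

	return 0;
}
```

With the values from the report (directory mode 01777, file owned by the squashed uid, caller fsuid 0, sysctl set to 1), the model returns -EACCES unless `patched` and `cap_fowner` are both set.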
* Re: [PATCH] vfs: bypass may_create_in_sticky check if task has CAP_FOWNER
From: Christian Brauner @ 2022-07-27 12:37 UTC (permalink / raw)
To: Jeff Layton; +Cc: viro, linux-fsdevel, linux-nfs, linux-kernel, Yongchen Yang
On Wed, Jul 27, 2022 at 08:30:48AM -0400, Jeff Layton wrote:
> NFS server is exporting a sticky directory (mode 01777) with root
> squashing enabled. Client has protected_regular enabled and then tries to
> open a file as root in that directory. File is created (with ownership
> set to nobody:nobody) but the open syscall returns an error.
>
> The problem is may_create_in_sticky, which rejects the open even though
> the file has already been created/opened. Bypass the checks in
> may_create_in_sticky if the task has CAP_FOWNER in the given namespace.
>
> Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976829
> Reported-by: Yongchen Yang <yoyang@redhat.com>
> Suggested-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
> fs/namei.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 1f28d3f463c3..170c2396ba29 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1230,7 +1230,8 @@ static int may_create_in_sticky(struct user_namespace *mnt_userns,
> (!sysctl_protected_regular && S_ISREG(inode->i_mode)) ||
> likely(!(dir_mode & S_ISVTX)) ||
> uid_eq(i_uid_into_mnt(mnt_userns, inode), dir_uid) ||
> - uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)))
> + uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)) ||
> + ns_capable(mnt_userns, CAP_FOWNER))
> return 0;
Hm, no. You really want inode_owner_or_capable() here..
You need to verify that you have a mapping for the inode->i_{g,u}id in
question and that you're having CAP_FOWNER in the caller's userns.

I'm pretty sure we should also restrict this to the case where the caller
actually created the file otherwise we introduce a potential issue where
the caller is susceptible to data spoofing. For example, the file was
created by another user racing the caller's O_CREAT.
* Re: [PATCH] vfs: bypass may_create_in_sticky check if task has CAP_FOWNER
From: Jeff Layton @ 2022-07-27 12:55 UTC (permalink / raw)
To: Christian Brauner
Cc: viro, linux-fsdevel, linux-nfs, linux-kernel, Yongchen Yang
On Wed, 2022-07-27 at 14:37 +0200, Christian Brauner wrote:
> On Wed, Jul 27, 2022 at 08:30:48AM -0400, Jeff Layton wrote:
> > NFS server is exporting a sticky directory (mode 01777) with root
> > squashing enabled. Client has protected_regular enabled and then tries to
> > open a file as root in that directory. File is created (with ownership
> > set to nobody:nobody) but the open syscall returns an error.
> >
> > The problem is may_create_in_sticky, which rejects the open even though
> > the file has already been created/opened. Bypass the checks in
> > may_create_in_sticky if the task has CAP_FOWNER in the given namespace.
> >
> > Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976829
> > Reported-by: Yongchen Yang <yoyang@redhat.com>
> > Suggested-by: Christian Brauner <brauner@kernel.org>
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> > fs/namei.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/namei.c b/fs/namei.c
> > index 1f28d3f463c3..170c2396ba29 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -1230,7 +1230,8 @@ static int may_create_in_sticky(struct user_namespace *mnt_userns,
> > (!sysctl_protected_regular && S_ISREG(inode->i_mode)) ||
> > likely(!(dir_mode & S_ISVTX)) ||
> > uid_eq(i_uid_into_mnt(mnt_userns, inode), dir_uid) ||
> > - uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)))
> > + uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)) ||
> > + ns_capable(mnt_userns, CAP_FOWNER))
> > return 0;
>
> Hm, no. You really want inode_owner_or_capable() here..
> You need to verify that you have a mapping for the inode->i_{g,u}id in
> question and that you're having CAP_FOWNER in the caller's userns.
>
Ok, I should be able to make that change and test it out.
> I'm pretty sure we should also restrict this to the case where the caller
> actually created the file otherwise we introduce a potential issue where
> the caller is susceptible to data spoofing. For example, the file was
> created by another user racing the caller's O_CREAT.
That won't be sufficient to fix the testcase, I think. If a file already
exists in the sticky dir and is owned by nobody:nobody, do we really
want to prevent root from opening it? I wouldn't think so.
--
Jeff Layton <jlayton@kernel.org>
* Re: [PATCH] vfs: bypass may_create_in_sticky check if task has CAP_FOWNER
From: Christian Brauner @ 2022-07-27 13:16 UTC (permalink / raw)
To: Jeff Layton; +Cc: viro, linux-fsdevel, linux-nfs, linux-kernel, Yongchen Yang
On Wed, Jul 27, 2022 at 08:55:35AM -0400, Jeff Layton wrote:
> On Wed, 2022-07-27 at 14:37 +0200, Christian Brauner wrote:
> > On Wed, Jul 27, 2022 at 08:30:48AM -0400, Jeff Layton wrote:
> > > NFS server is exporting a sticky directory (mode 01777) with root
> > > squashing enabled. Client has protected_regular enabled and then tries to
> > > open a file as root in that directory. File is created (with ownership
> > > set to nobody:nobody) but the open syscall returns an error.
> > >
> > > The problem is may_create_in_sticky, which rejects the open even though
> > > the file has already been created/opened. Bypass the checks in
> > > may_create_in_sticky if the task has CAP_FOWNER in the given namespace.
> > >
> > > Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976829
> > > Reported-by: Yongchen Yang <yoyang@redhat.com>
> > > Suggested-by: Christian Brauner <brauner@kernel.org>
> > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > ---
> > > fs/namei.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/namei.c b/fs/namei.c
> > > index 1f28d3f463c3..170c2396ba29 100644
> > > --- a/fs/namei.c
> > > +++ b/fs/namei.c
> > > @@ -1230,7 +1230,8 @@ static int may_create_in_sticky(struct user_namespace *mnt_userns,
> > > (!sysctl_protected_regular && S_ISREG(inode->i_mode)) ||
> > > likely(!(dir_mode & S_ISVTX)) ||
> > > uid_eq(i_uid_into_mnt(mnt_userns, inode), dir_uid) ||
> > > - uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)))
> > > + uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)) ||
> > > + ns_capable(mnt_userns, CAP_FOWNER))
> > > return 0;
> >
> > Hm, no. You really want inode_owner_or_capable() here..
> > You need to verify that you have a mapping for the inode->i_{g,u}id in
> > question and that you're having CAP_FOWNER in the caller's userns.
> >
>
> Ok, I should be able to make that change and test it out.
>
> > I'm pretty sure we should also restrict this to the case where the caller
> > actually created the file otherwise we introduce a potential issue where
> > the caller is susceptible to data spoofing. For example, the file was
> > created by another user racing the caller's O_CREAT.
>
> That won't be sufficient to fix the testcase, I think. If a file already
> exists in the sticky dir and is owned by nobody:nobody, do we really
> want to prevent root from opening it? I wouldn't think so.
Afaict, the whole shtick behind the protected_regular check in
may_create_in_sticky() is that it prevents scenarios where you can be
tricked into opening a file that you didn't intend to with O_CREAT.

That's specifically also a protection for root. So say root specifies
O_CREAT but someone beats root to it and creates the file, dumping
malicious data in there. The uid_eq() requirement is supposed to prevent
such attacks and it's a sysctl that userspace opted into.

We'd be relaxing that restriction quite a bit if we allowed not just newly
created but also pre-existing files to be opened, even with the CAP_FOWNER
requirement.

So the dd call should really fail if O_CREAT is passed but the file is
pre-existing, imho. It's a different story if dd created that file and
has CAP_FOWNER imho.
* Re: [PATCH] vfs: bypass may_create_in_sticky check if task has CAP_FOWNER
From: Jeff Layton @ 2022-07-27 13:29 UTC (permalink / raw)
To: Christian Brauner
Cc: viro, linux-fsdevel, linux-nfs, linux-kernel, Yongchen Yang
On Wed, 2022-07-27 at 15:16 +0200, Christian Brauner wrote:
> On Wed, Jul 27, 2022 at 08:55:35AM -0400, Jeff Layton wrote:
> > On Wed, 2022-07-27 at 14:37 +0200, Christian Brauner wrote:
> > > On Wed, Jul 27, 2022 at 08:30:48AM -0400, Jeff Layton wrote:
> > > > NFS server is exporting a sticky directory (mode 01777) with root
> > > > squashing enabled. Client has protected_regular enabled and then tries to
> > > > open a file as root in that directory. File is created (with ownership
> > > > set to nobody:nobody) but the open syscall returns an error.
> > > >
> > > > The problem is may_create_in_sticky, which rejects the open even though
> > > > the file has already been created/opened. Bypass the checks in
> > > > may_create_in_sticky if the task has CAP_FOWNER in the given namespace.
> > > >
> > > > Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976829
> > > > Reported-by: Yongchen Yang <yoyang@redhat.com>
> > > > Suggested-by: Christian Brauner <brauner@kernel.org>
> > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > ---
> > > > fs/namei.c | 3 ++-
> > > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/fs/namei.c b/fs/namei.c
> > > > index 1f28d3f463c3..170c2396ba29 100644
> > > > --- a/fs/namei.c
> > > > +++ b/fs/namei.c
> > > > @@ -1230,7 +1230,8 @@ static int may_create_in_sticky(struct user_namespace *mnt_userns,
> > > > (!sysctl_protected_regular && S_ISREG(inode->i_mode)) ||
> > > > likely(!(dir_mode & S_ISVTX)) ||
> > > > uid_eq(i_uid_into_mnt(mnt_userns, inode), dir_uid) ||
> > > > - uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)))
> > > > + uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)) ||
> > > > + ns_capable(mnt_userns, CAP_FOWNER))
> > > > return 0;
> > >
> > > Hm, no. You really want inode_owner_or_capable() here..
> > > You need to verify that you have a mapping for the inode->i_{g,u}id in
> > > question and that you're having CAP_FOWNER in the caller's userns.
> > >
> >
> > Ok, I should be able to make that change and test it out.
> >
> > > I'm pretty sure we should also restrict this to the case where the caller
> > > actually created the file otherwise we introduce a potential issue where
> > > the caller is susceptible to data spoofing. For example, the file was
> > > created by another user racing the caller's O_CREAT.
> >
> > That won't be sufficient to fix the testcase, I think. If a file already
> > exists in the sticky dir and is owned by nobody:nobody, do we really
> > want to prevent root from opening it? I wouldn't think so.
>
> Afaict, the whole shtick behind the protected_regular check in
> may_create_in_sticky() is that it prevents scenarios where you can be
> tricked into opening a file that you didn't intend to with O_CREAT.
>
Yuck. The proper way to get that protection is to use O_EXCL...
> That's specifically also a protection for root. So say root specifies
> O_CREAT but someone beats root to it and creates the file dumping
> malicious data in there. The uid_eq() requirement is supposed to prevent
> such attacks and it's a sysctl that userspace opted into.
>
> We'd be relaxing that restriction quite a bit if we allowed not just newly
> created but also pre-existing files to be opened, even with the CAP_FOWNER
> requirement.
>
> So the dd call should really fail if O_CREAT is passed but the file is
> pre-existing, imho. It's a different story if dd created that file and
> has CAP_FOWNER imho.
That's pretty nasty. So if I create a new file as root in a sticky dir,
then close it and try to open it again, it'll fail with -EACCES? That's
terribly confusing.
--
Jeff Layton <jlayton@kernel.org>
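
The O_EXCL approach mentioned above can be sketched from userspace as follows. This is a minimal illustration, assuming a caller that must never open a file somebody else created first; the helper name `create_exclusive` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create path only if it does not exist yet, then close it again.
 * Returns 0 on success or a negative errno value; -EEXIST means the
 * name was already taken, so nothing was opened.
 */
static int create_exclusive(const char *path, mode_t mode)
{
	int fd;

	assert(path != NULL);

	/* O_CREAT | O_EXCL makes "check and create" a single atomic step. */
	fd = open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
	if (fd < 0)
		return -errno;

	close(fd);
	return 0;
}
```

A caller that sees -EEXIST knows it lost the race and can decide what to do, rather than silently writing into a file of unknown origin.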
* Re: [PATCH] vfs: bypass may_create_in_sticky check if task has CAP_FOWNER
From: Christian Brauner @ 2022-07-27 13:36 UTC (permalink / raw)
To: Jeff Layton; +Cc: viro, linux-fsdevel, linux-nfs, linux-kernel, Yongchen Yang
On Wed, Jul 27, 2022 at 09:29:37AM -0400, Jeff Layton wrote:
> On Wed, 2022-07-27 at 15:16 +0200, Christian Brauner wrote:
> > On Wed, Jul 27, 2022 at 08:55:35AM -0400, Jeff Layton wrote:
> > > On Wed, 2022-07-27 at 14:37 +0200, Christian Brauner wrote:
> > > > On Wed, Jul 27, 2022 at 08:30:48AM -0400, Jeff Layton wrote:
> > > > > NFS server is exporting a sticky directory (mode 01777) with root
> > > > > squashing enabled. Client has protected_regular enabled and then tries to
> > > > > open a file as root in that directory. File is created (with ownership
> > > > > set to nobody:nobody) but the open syscall returns an error.
> > > > >
> > > > > The problem is may_create_in_sticky, which rejects the open even though
> > > > > the file has already been created/opened. Bypass the checks in
> > > > > may_create_in_sticky if the task has CAP_FOWNER in the given namespace.
> > > > >
> > > > > Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976829
> > > > > Reported-by: Yongchen Yang <yoyang@redhat.com>
> > > > > Suggested-by: Christian Brauner <brauner@kernel.org>
> > > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > > ---
> > > > > fs/namei.c | 3 ++-
> > > > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/fs/namei.c b/fs/namei.c
> > > > > index 1f28d3f463c3..170c2396ba29 100644
> > > > > --- a/fs/namei.c
> > > > > +++ b/fs/namei.c
> > > > > @@ -1230,7 +1230,8 @@ static int may_create_in_sticky(struct user_namespace *mnt_userns,
> > > > > (!sysctl_protected_regular && S_ISREG(inode->i_mode)) ||
> > > > > likely(!(dir_mode & S_ISVTX)) ||
> > > > > uid_eq(i_uid_into_mnt(mnt_userns, inode), dir_uid) ||
> > > > > - uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)))
> > > > > + uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)) ||
> > > > > + ns_capable(mnt_userns, CAP_FOWNER))
> > > > > return 0;
> > > >
> > > > Hm, no. You really want inode_owner_or_capable() here..
> > > > You need to verify that you have a mapping for the inode->i_{g,u}id in
> > > > question and that you're having CAP_FOWNER in the caller's userns.
> > > >
> > >
> > > Ok, I should be able to make that change and test it out.
> > >
> > > > I'm pretty sure we should also restrict this to the case where the caller
> > > > actually created the file otherwise we introduce a potential issue where
> > > > the caller is susceptible to data spoofing. For example, the file was
> > > > created by another user racing the caller's O_CREAT.
> > >
> > > That won't be sufficient to fix the testcase, I think. If a file already
> > > exists in the sticky dir and is owned by nobody:nobody, do we really
> > > want to prevent root from opening it? I wouldn't think so.
> >
> > Afaict, the whole shtick behind the protected_regular check in
> > may_create_in_sticky() is that it prevents scenarios where you can be
> > tricked into opening a file that you didn't intend to with O_CREAT.
> >
>
> Yuck. The proper way to get that protection is to use O_EXCL...
I'm not saying the interface was a particularly great idea. But it's at
least a sysctl...
>
> > That's specifically also a protection for root. So say root specifies
> > O_CREAT but someone beats root to it and creates the file dumping
> > malicious data in there. The uid_eq() requirement is supposed to prevent
> > such attacks and it's a sysctl that userspace opted into.
> >
> > We'd be relaxing that restriction quite a bit if we allowed not just newly
> > created but also pre-existing files to be opened, even with the CAP_FOWNER
> > requirement.
> >
> > So the dd call should really fail if O_CREAT is passed but the file is
> > pre-existing, imho. It's a different story if dd created that file and
> > has CAP_FOWNER imho.
>
> That's pretty nasty. So if I create a new file as root in a sticky dir,
> then close it and try to open it again, it'll fail with -EACCES? That's
> terribly confusing.
At least only if you try to re-open with O_CREAT and have this
protected_regular sysctl thingy turned on...
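
The create-then-reopen sequence under discussion can be exercised with a small userspace sketch like the one below. The helper names and the result struct are hypothetical. On a local filesystem where the caller ends up owning the file, both opens would be expected to succeed; the interesting case is the one from the report (root on a client of a root-squashed NFS export, sticky world-writable directory, protected_regular enabled), where the thread describes the first open already returning an error, and where the second open with O_CREAT would return -EACCES if only newly created files were exempted.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Result of each open() attempt: 0 on success or a negative errno. */
struct reopen_result {
	int first;
	int second;
};

/* Open path with O_CREAT (no O_EXCL), close it, and report the outcome. */
static int open_creat_once(const char *path)
{
	int fd;

	assert(path != NULL);

	fd = open(path, O_WRONLY | O_CREAT, 0644);
	if (fd < 0)
		return -errno;

	close(fd);
	return 0;
}

/* First call creates the file (if absent); second call finds it existing. */
static struct reopen_result create_then_reopen(const char *path)
{
	struct reopen_result r;

	r.first = open_creat_once(path);
	r.second = open_creat_once(path);
	return r;
}
```

Comparing `first` and `second` on the affected mount, with the sysctl on and off, shows which of the two opens the sticky-directory check rejects.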