linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] block: Check ADMIN before NICE for IOPRIO_CLASS_RT
@ 2021-11-15 17:38 Alistair Delva
  2021-11-15 18:04 ` Jens Axboe
  2021-11-15 19:04 ` Ondrej Mosnacek
  0 siblings, 2 replies; 11+ messages in thread
From: Alistair Delva @ 2021-11-15 17:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Khazhismel Kumykov, Bart Van Assche, Serge Hallyn, Jens Axboe,
	Greg Kroah-Hartman, Paul Moore, selinux, linux-security-module,
	kernel-team, stable

Booting to Android userspace on 5.14 or newer triggers the following
SELinux denial:

avc: denied { sys_nice } for comm="init" capability=23
     scontext=u:r:init:s0 tcontext=u:r:init:s0 tclass=capability
     permissive=0

Init is PID 0 running as root, so it already has CAP_SYS_ADMIN. For
better compatibility with older SEPolicy, check ADMIN before NICE.

Fixes: 9d3a39a5f1e4 ("block: grant IOPRIO_CLASS_RT to CAP_SYS_NICE")
Signed-off-by: Alistair Delva <adelva@google.com>
Cc: Khazhismel Kumykov <khazhy@google.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: selinux@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Cc: kernel-team@android.com
Cc: stable@vger.kernel.org # v5.14+
---
 block/ioprio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/ioprio.c b/block/ioprio.c
index 0e4ff245f2bf..4d59c559e057 100644
--- a/block/ioprio.c
+++ b/block/ioprio.c
@@ -69,7 +69,7 @@ int ioprio_check_cap(int ioprio)
 
 	switch (class) {
 		case IOPRIO_CLASS_RT:
-			if (!capable(CAP_SYS_NICE) && !capable(CAP_SYS_ADMIN))
+			if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_NICE))
 				return -EPERM;
 			fallthrough;
 			/* rt has prio field too */
-- 
2.34.0.rc1.387.gb447b232ab-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] block: Check ADMIN before NICE for IOPRIO_CLASS_RT
  2021-11-15 17:38 [PATCH] block: Check ADMIN before NICE for IOPRIO_CLASS_RT Alistair Delva
@ 2021-11-15 18:04 ` Jens Axboe
  2021-11-15 18:17   ` Alistair Delva
  2021-11-15 19:04 ` Ondrej Mosnacek
  1 sibling, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2021-11-15 18:04 UTC (permalink / raw)
  To: Alistair Delva, linux-kernel
  Cc: Khazhismel Kumykov, Bart Van Assche, Serge Hallyn,
	Greg Kroah-Hartman, Paul Moore, selinux, linux-security-module,
	kernel-team, stable

On 11/15/21 10:38 AM, Alistair Delva wrote:
> Booting to Android userspace on 5.14 or newer triggers the following
> SELinux denial:
> 
> avc: denied { sys_nice } for comm="init" capability=23
>      scontext=u:r:init:s0 tcontext=u:r:init:s0 tclass=capability
>      permissive=0
> 
> Init is PID 0 running as root, so it already has CAP_SYS_ADMIN. For
> better compatibility with older SEPolicy, check ADMIN before NICE.

Seems a bit wonky to me, but the end result is the same. In any case,
this warrants a comment above it detailing why the ordering is
seemingly important.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] block: Check ADMIN before NICE for IOPRIO_CLASS_RT
  2021-11-15 18:04 ` Jens Axboe
@ 2021-11-15 18:17   ` Alistair Delva
  0 siblings, 0 replies; 11+ messages in thread
From: Alistair Delva @ 2021-11-15 18:17 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-kernel, Khazhismel Kumykov, Bart Van Assche, Serge Hallyn,
	Greg Kroah-Hartman, Paul Moore, selinux, linux-security-module,
	kernel-team, stable

On Mon, Nov 15, 2021 at 10:04 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 11/15/21 10:38 AM, Alistair Delva wrote:
> > Booting to Android userspace on 5.14 or newer triggers the following
> > SELinux denial:
> >
> > avc: denied { sys_nice } for comm="init" capability=23
> >      scontext=u:r:init:s0 tcontext=u:r:init:s0 tclass=capability
> >      permissive=0
> >
> > Init is PID 0 running as root, so it already has CAP_SYS_ADMIN. For
> > better compatibility with older SEPolicy, check ADMIN before NICE.
>
> Seems a bit wonky to me, but the end result is the same.

No argument from me..

> In any case,
> this warrants a comment above it detailing why the ordering is
> seemingly important.

Sent v2.

> --
> Jens Axboe
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] block: Check ADMIN before NICE for IOPRIO_CLASS_RT
  2021-11-15 17:38 [PATCH] block: Check ADMIN before NICE for IOPRIO_CLASS_RT Alistair Delva
  2021-11-15 18:04 ` Jens Axboe
@ 2021-11-15 19:04 ` Ondrej Mosnacek
  2021-11-15 19:08   ` Alistair Delva
  2021-11-15 19:31   ` Greg Kroah-Hartman
  1 sibling, 2 replies; 11+ messages in thread
From: Ondrej Mosnacek @ 2021-11-15 19:04 UTC (permalink / raw)
  To: Alistair Delva
  Cc: Linux kernel mailing list, Khazhismel Kumykov, Bart Van Assche,
	Serge Hallyn, Jens Axboe, Greg Kroah-Hartman, Paul Moore,
	SElinux list, Linux Security Module list, Cc: Android Kernel,
	Linux Stable maillist

On Mon, Nov 15, 2021 at 7:14 PM Alistair Delva <adelva@google.com> wrote:
> Booting to Android userspace on 5.14 or newer triggers the following
> SELinux denial:
>
> avc: denied { sys_nice } for comm="init" capability=23
>      scontext=u:r:init:s0 tcontext=u:r:init:s0 tclass=capability
>      permissive=0
>
> Init is PID 0 running as root, so it already has CAP_SYS_ADMIN. For
> better compatibility with older SEPolicy, check ADMIN before NICE.

But with this patch you in turn punish the new/better policies that
try to avoid giving domains CAP_SYS_ADMIN unless necessary (using only
the more granular capabilities wherever possible), which may now get a
bogus sys_admin denial. IMHO the order is better as it is, as it
motivates the "good" policy writing behavior - i.e. spelling out the
capability permissions more explicitly and avoiding CAP_SYS_ADMIN.

IOW, if you domain does CAP_SYS_NICE things, and you didn't explicitly
grant it that (and instead rely on the CAP_SYS_ADMIN fallback), then
the denial correctly flags it as an issue in your policy and
encourages you to add that sys_nice permission to the domain. Then
when one beautiful hypothetical day the CAP_SYS_ADMIN fallback is
removed, your policy will be ready for that and things will keep
working.

Feel free to carry that patch downstream if patching the kernel is
easier for you than fixing the policy, but for the upstream kernel
this is just a step in the wrong direction.

>
> Fixes: 9d3a39a5f1e4 ("block: grant IOPRIO_CLASS_RT to CAP_SYS_NICE")
> Signed-off-by: Alistair Delva <adelva@google.com>
> Cc: Khazhismel Kumykov <khazhy@google.com>
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: Serge Hallyn <serge@hallyn.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Paul Moore <paul@paul-moore.com>
> Cc: selinux@vger.kernel.org
> Cc: linux-security-module@vger.kernel.org
> Cc: kernel-team@android.com
> Cc: stable@vger.kernel.org # v5.14+
> ---
>  block/ioprio.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/ioprio.c b/block/ioprio.c
> index 0e4ff245f2bf..4d59c559e057 100644
> --- a/block/ioprio.c
> +++ b/block/ioprio.c
> @@ -69,7 +69,7 @@ int ioprio_check_cap(int ioprio)
>
>         switch (class) {
>                 case IOPRIO_CLASS_RT:
> -                       if (!capable(CAP_SYS_NICE) && !capable(CAP_SYS_ADMIN))
> +                       if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_NICE))
>                                 return -EPERM;
>                         fallthrough;
>                         /* rt has prio field too */
> --
> 2.34.0.rc1.387.gb447b232ab-goog
>

--
Ondrej Mosnacek
Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] block: Check ADMIN before NICE for IOPRIO_CLASS_RT
  2021-11-15 19:04 ` Ondrej Mosnacek
@ 2021-11-15 19:08   ` Alistair Delva
  2021-11-15 21:01     ` Dominick Grift
                       ` (2 more replies)
  2021-11-15 19:31   ` Greg Kroah-Hartman
  1 sibling, 3 replies; 11+ messages in thread
From: Alistair Delva @ 2021-11-15 19:08 UTC (permalink / raw)
  To: Ondrej Mosnacek
  Cc: Linux kernel mailing list, Khazhismel Kumykov, Bart Van Assche,
	Serge Hallyn, Jens Axboe, Greg Kroah-Hartman, Paul Moore,
	SElinux list, Linux Security Module list, Cc: Android Kernel,
	Linux Stable maillist

On Mon, Nov 15, 2021 at 11:04 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>
> On Mon, Nov 15, 2021 at 7:14 PM Alistair Delva <adelva@google.com> wrote:
> > Booting to Android userspace on 5.14 or newer triggers the following
> > SELinux denial:
> >
> > avc: denied { sys_nice } for comm="init" capability=23
> >      scontext=u:r:init:s0 tcontext=u:r:init:s0 tclass=capability
> >      permissive=0
> >
> > Init is PID 0 running as root, so it already has CAP_SYS_ADMIN. For
> > better compatibility with older SEPolicy, check ADMIN before NICE.
>
> But with this patch you in turn punish the new/better policies that
> try to avoid giving domains CAP_SYS_ADMIN unless necessary (using only
> the more granular capabilities wherever possible), which may now get a
> bogus sys_admin denial. IMHO the order is better as it is, as it
> motivates the "good" policy writing behavior - i.e. spelling out the
> capability permissions more explicitly and avoiding CAP_SYS_ADMIN.
>
> IOW, if you domain does CAP_SYS_NICE things, and you didn't explicitly
> grant it that (and instead rely on the CAP_SYS_ADMIN fallback), then
> the denial correctly flags it as an issue in your policy and
> encourages you to add that sys_nice permission to the domain. Then
> when one beautiful hypothetical day the CAP_SYS_ADMIN fallback is
> removed, your policy will be ready for that and things will keep
> working.
>
> Feel free to carry that patch downstream if patching the kernel is
> easier for you than fixing the policy, but for the upstream kernel
> this is just a step in the wrong direction.

I'm personally fine with this position, but I am curious why "never
break userspace" doesn't apply to SELinux policies. At the end of the
day, booting 5.13 or older, we don't get a denial, and there's nothing
for the sysadmin to do. On 5.14 and newer, we get denials. This is a
common pattern we see each year: some new capability or permission is
required where it wasn't required before, and there's no compatibility
mechanism to grandfather in old policies. So, we have to touch
userspace. If this is just how things are, I can certainly update our
init.te definitions.

> > Fixes: 9d3a39a5f1e4 ("block: grant IOPRIO_CLASS_RT to CAP_SYS_NICE")
> > Signed-off-by: Alistair Delva <adelva@google.com>
> > Cc: Khazhismel Kumykov <khazhy@google.com>
> > Cc: Bart Van Assche <bvanassche@acm.org>
> > Cc: Serge Hallyn <serge@hallyn.com>
> > Cc: Jens Axboe <axboe@kernel.dk>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: Paul Moore <paul@paul-moore.com>
> > Cc: selinux@vger.kernel.org
> > Cc: linux-security-module@vger.kernel.org
> > Cc: kernel-team@android.com
> > Cc: stable@vger.kernel.org # v5.14+
> > ---
> >  block/ioprio.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/block/ioprio.c b/block/ioprio.c
> > index 0e4ff245f2bf..4d59c559e057 100644
> > --- a/block/ioprio.c
> > +++ b/block/ioprio.c
> > @@ -69,7 +69,7 @@ int ioprio_check_cap(int ioprio)
> >
> >         switch (class) {
> >                 case IOPRIO_CLASS_RT:
> > -                       if (!capable(CAP_SYS_NICE) && !capable(CAP_SYS_ADMIN))
> > +                       if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_NICE))
> >                                 return -EPERM;
> >                         fallthrough;
> >                         /* rt has prio field too */
> > --
> > 2.34.0.rc1.387.gb447b232ab-goog
> >
>
> --
> Ondrej Mosnacek
> Software Engineer, Linux Security - SELinux kernel
> Red Hat, Inc.
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] block: Check ADMIN before NICE for IOPRIO_CLASS_RT
  2021-11-15 19:04 ` Ondrej Mosnacek
  2021-11-15 19:08   ` Alistair Delva
@ 2021-11-15 19:31   ` Greg Kroah-Hartman
  1 sibling, 0 replies; 11+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-15 19:31 UTC (permalink / raw)
  To: Ondrej Mosnacek
  Cc: Alistair Delva, Linux kernel mailing list, Khazhismel Kumykov,
	Bart Van Assche, Serge Hallyn, Jens Axboe, Paul Moore,
	SElinux list, Linux Security Module list, Cc: Android Kernel,
	Linux Stable maillist

On Mon, Nov 15, 2021 at 08:04:05PM +0100, Ondrej Mosnacek wrote:
> On Mon, Nov 15, 2021 at 7:14 PM Alistair Delva <adelva@google.com> wrote:
> > Booting to Android userspace on 5.14 or newer triggers the following
> > SELinux denial:
> >
> > avc: denied { sys_nice } for comm="init" capability=23
> >      scontext=u:r:init:s0 tcontext=u:r:init:s0 tclass=capability
> >      permissive=0
> >
> > Init is PID 0 running as root, so it already has CAP_SYS_ADMIN. For
> > better compatibility with older SEPolicy, check ADMIN before NICE.
> 
> But with this patch you in turn punish the new/better policies that
> try to avoid giving domains CAP_SYS_ADMIN unless necessary (using only
> the more granular capabilities wherever possible), which may now get a
> bogus sys_admin denial. IMHO the order is better as it is, as it
> motivates the "good" policy writing behavior - i.e. spelling out the
> capability permissions more explicitly and avoiding CAP_SYS_ADMIN.
> 
> IOW, if you domain does CAP_SYS_NICE things, and you didn't explicitly
> grant it that (and instead rely on the CAP_SYS_ADMIN fallback), then
> the denial correctly flags it as an issue in your policy and
> encourages you to add that sys_nice permission to the domain. Then
> when one beautiful hypothetical day the CAP_SYS_ADMIN fallback is
> removed, your policy will be ready for that and things will keep
> working.
> 
> Feel free to carry that patch downstream if patching the kernel is
> easier for you than fixing the policy, but for the upstream kernel
> this is just a step in the wrong direction.

So you want to "punish" existing systems by throwing up a warning where
there used to not be one?  That is not nice, you need to handle
upgrading kernels without breaking or causing problems like this.

Yes, SELinux has done this in the past, with many different things, but
that does not mean that it _should_ do this.  Please realize that you do
not want to punish people from upgrading their kernel to a newer
version.  If you do so, they will never upgrade.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] block: Check ADMIN before NICE for IOPRIO_CLASS_RT
  2021-11-15 19:08   ` Alistair Delva
@ 2021-11-15 21:01     ` Dominick Grift
  2021-11-15 21:42     ` Casey Schaufler
  2021-11-16  9:30     ` David Laight
  2 siblings, 0 replies; 11+ messages in thread
From: Dominick Grift @ 2021-11-15 21:01 UTC (permalink / raw)
  To: Alistair Delva
  Cc: Ondrej Mosnacek, Linux kernel mailing list, Khazhismel Kumykov,
	Bart Van Assche, Serge Hallyn, Jens Axboe, Greg Kroah-Hartman,
	Paul Moore, SElinux list, Linux Security Module list,
	Cc: Android Kernel, Linux Stable maillist

Alistair Delva <adelva@google.com> writes:

> On Mon, Nov 15, 2021 at 11:04 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>>
>> On Mon, Nov 15, 2021 at 7:14 PM Alistair Delva <adelva@google.com> wrote:
>> > Booting to Android userspace on 5.14 or newer triggers the following
>> > SELinux denial:
>> >
>> > avc: denied { sys_nice } for comm="init" capability=23
>> >      scontext=u:r:init:s0 tcontext=u:r:init:s0 tclass=capability
>> >      permissive=0
>> >
>> > Init is PID 0 running as root, so it already has CAP_SYS_ADMIN. For
>> > better compatibility with older SEPolicy, check ADMIN before NICE.
>>
>> But with this patch you in turn punish the new/better policies that
>> try to avoid giving domains CAP_SYS_ADMIN unless necessary (using only
>> the more granular capabilities wherever possible), which may now get a
>> bogus sys_admin denial. IMHO the order is better as it is, as it
>> motivates the "good" policy writing behavior - i.e. spelling out the
>> capability permissions more explicitly and avoiding CAP_SYS_ADMIN.
>>
>> IOW, if you domain does CAP_SYS_NICE things, and you didn't explicitly
>> grant it that (and instead rely on the CAP_SYS_ADMIN fallback), then
>> the denial correctly flags it as an issue in your policy and
>> encourages you to add that sys_nice permission to the domain. Then
>> when one beautiful hypothetical day the CAP_SYS_ADMIN fallback is
>> removed, your policy will be ready for that and things will keep
>> working.
>>
>> Feel free to carry that patch downstream if patching the kernel is
>> easier for you than fixing the policy, but for the upstream kernel
>> this is just a step in the wrong direction.
>
> I'm personally fine with this position, but I am curious why "never
> break userspace" doesn't apply to SELinux policies. At the end of the
> day, booting 5.13 or older, we don't get a denial, and there's nothing
> for the sysadmin to do. On 5.14 and newer, we get denials. This is a
> common pattern we see each year: some new capability or permission is
> required where it wasn't required before, and there's no compatibility
> mechanism to grandfather in old policies. So, we have to touch
> userspace. If this is just how things are, I can certainly update our
> init.te definitions.

User space is not broken? If you just ignore this AVC denial then it
will pass on cap_sys_admin. In other words everything still works, you
only get a AVC denial for cap_sys_nice now.

>
>> > Fixes: 9d3a39a5f1e4 ("block: grant IOPRIO_CLASS_RT to CAP_SYS_NICE")
>> > Signed-off-by: Alistair Delva <adelva@google.com>
>> > Cc: Khazhismel Kumykov <khazhy@google.com>
>> > Cc: Bart Van Assche <bvanassche@acm.org>
>> > Cc: Serge Hallyn <serge@hallyn.com>
>> > Cc: Jens Axboe <axboe@kernel.dk>
>> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> > Cc: Paul Moore <paul@paul-moore.com>
>> > Cc: selinux@vger.kernel.org
>> > Cc: linux-security-module@vger.kernel.org
>> > Cc: kernel-team@android.com
>> > Cc: stable@vger.kernel.org # v5.14+
>> > ---
>> >  block/ioprio.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/block/ioprio.c b/block/ioprio.c
>> > index 0e4ff245f2bf..4d59c559e057 100644
>> > --- a/block/ioprio.c
>> > +++ b/block/ioprio.c
>> > @@ -69,7 +69,7 @@ int ioprio_check_cap(int ioprio)
>> >
>> >         switch (class) {
>> >                 case IOPRIO_CLASS_RT:
>> > -                       if (!capable(CAP_SYS_NICE) && !capable(CAP_SYS_ADMIN))
>> > +                       if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_NICE))
>> >                                 return -EPERM;
>> >                         fallthrough;
>> >                         /* rt has prio field too */
>> > --
>> > 2.34.0.rc1.387.gb447b232ab-goog
>> >
>>
>> --
>> Ondrej Mosnacek
>> Software Engineer, Linux Security - SELinux kernel
>> Red Hat, Inc.
>>

-- 
gpg --locate-keys dominick.grift@defensec.nl
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
Dominick Grift

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] block: Check ADMIN before NICE for IOPRIO_CLASS_RT
  2021-11-15 19:08   ` Alistair Delva
  2021-11-15 21:01     ` Dominick Grift
@ 2021-11-15 21:42     ` Casey Schaufler
  2021-11-16  8:21       ` Greg Kroah-Hartman
  2021-11-16  9:30     ` David Laight
  2 siblings, 1 reply; 11+ messages in thread
From: Casey Schaufler @ 2021-11-15 21:42 UTC (permalink / raw)
  To: Alistair Delva, Ondrej Mosnacek
  Cc: Linux kernel mailing list, Khazhismel Kumykov, Bart Van Assche,
	Serge Hallyn, Jens Axboe, Greg Kroah-Hartman, Paul Moore,
	SElinux list, Linux Security Module list, Cc: Android Kernel,
	Linux Stable maillist, Casey Schaufler

On 11/15/2021 11:08 AM, Alistair Delva wrote:
> On Mon, Nov 15, 2021 at 11:04 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>> On Mon, Nov 15, 2021 at 7:14 PM Alistair Delva <adelva@google.com> wrote:
>>> Booting to Android userspace on 5.14 or newer triggers the following
>>> SELinux denial:
>>>
>>> avc: denied { sys_nice } for comm="init" capability=23
>>>       scontext=u:r:init:s0 tcontext=u:r:init:s0 tclass=capability
>>>       permissive=0
>>>
>>> Init is PID 0 running as root, so it already has CAP_SYS_ADMIN. For
>>> better compatibility with older SEPolicy, check ADMIN before NICE.
>> But with this patch you in turn punish the new/better policies that
>> try to avoid giving domains CAP_SYS_ADMIN unless necessary (using only
>> the more granular capabilities wherever possible), which may now get a
>> bogus sys_admin denial. IMHO the order is better as it is, as it
>> motivates the "good" policy writing behavior - i.e. spelling out the
>> capability permissions more explicitly and avoiding CAP_SYS_ADMIN.
>>
>> IOW, if you domain does CAP_SYS_NICE things, and you didn't explicitly
>> grant it that (and instead rely on the CAP_SYS_ADMIN fallback), then
>> the denial correctly flags it as an issue in your policy and
>> encourages you to add that sys_nice permission to the domain. Then
>> when one beautiful hypothetical day the CAP_SYS_ADMIN fallback is
>> removed, your policy will be ready for that and things will keep
>> working.
>>
>> Feel free to carry that patch downstream if patching the kernel is
>> easier for you than fixing the policy, but for the upstream kernel
>> this is just a step in the wrong direction.
> I'm personally fine with this position, but I am curious why "never
> break userspace" doesn't apply to SELinux policies.

Because SELinux policy is configuration data, not system code.
One is free to modify SELinux policy to suit one's whims without
making any change to the Linux kernel or its APIs.

>   At the end of the
> day, booting 5.13 or older, we don't get a denial, and there's nothing
> for the sysadmin to do. On 5.14 and newer, we get denials. This is a
> common pattern we see each year: some new capability or permission is
> required where it wasn't required before, and there's no compatibility
> mechanism to grandfather in old policies.

This is an artifact of separating policy from mechanism. The
capability mechanism does not suffer from this issue because
it embodies its policy. SELinux, Smack, AppArmor and "containers"
are vulnerable to it because they explicitly deny the kernel and
kernel developers permission to make assumptions about how they
define "policy".

>   So, we have to touch
> userspace. If this is just how things are, I can certainly update our
> init.te definitions.

If SELinux was a required kernel mechanism and the policy was
included in the kernel tree you might be able to argue that
kernel developers are responsible for changes to SELinux policy.
But it ain't, and it isn't.   By design.

>
>>> Fixes: 9d3a39a5f1e4 ("block: grant IOPRIO_CLASS_RT to CAP_SYS_NICE")
>>> Signed-off-by: Alistair Delva <adelva@google.com>
>>> Cc: Khazhismel Kumykov <khazhy@google.com>
>>> Cc: Bart Van Assche <bvanassche@acm.org>
>>> Cc: Serge Hallyn <serge@hallyn.com>
>>> Cc: Jens Axboe <axboe@kernel.dk>
>>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>> Cc: Paul Moore <paul@paul-moore.com>
>>> Cc: selinux@vger.kernel.org
>>> Cc: linux-security-module@vger.kernel.org
>>> Cc: kernel-team@android.com
>>> Cc: stable@vger.kernel.org # v5.14+
>>> ---
>>>   block/ioprio.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/block/ioprio.c b/block/ioprio.c
>>> index 0e4ff245f2bf..4d59c559e057 100644
>>> --- a/block/ioprio.c
>>> +++ b/block/ioprio.c
>>> @@ -69,7 +69,7 @@ int ioprio_check_cap(int ioprio)
>>>
>>>          switch (class) {
>>>                  case IOPRIO_CLASS_RT:
>>> -                       if (!capable(CAP_SYS_NICE) && !capable(CAP_SYS_ADMIN))
>>> +                       if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_NICE))
>>>                                  return -EPERM;
>>>                          fallthrough;
>>>                          /* rt has prio field too */
>>> --
>>> 2.34.0.rc1.387.gb447b232ab-goog
>>>
>> --
>> Ondrej Mosnacek
>> Software Engineer, Linux Security - SELinux kernel
>> Red Hat, Inc.
>>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] block: Check ADMIN before NICE for IOPRIO_CLASS_RT
  2021-11-15 21:42     ` Casey Schaufler
@ 2021-11-16  8:21       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 11+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-16  8:21 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Alistair Delva, Ondrej Mosnacek, Linux kernel mailing list,
	Khazhismel Kumykov, Bart Van Assche, Serge Hallyn, Jens Axboe,
	Paul Moore, SElinux list, Linux Security Module list,
	Cc: Android Kernel, Linux Stable maillist

On Mon, Nov 15, 2021 at 01:42:54PM -0800, Casey Schaufler wrote:
> On 11/15/2021 11:08 AM, Alistair Delva wrote:
> > On Mon, Nov 15, 2021 at 11:04 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> > > On Mon, Nov 15, 2021 at 7:14 PM Alistair Delva <adelva@google.com> wrote:
> > > > Booting to Android userspace on 5.14 or newer triggers the following
> > > > SELinux denial:
> > > > 
> > > > avc: denied { sys_nice } for comm="init" capability=23
> > > >       scontext=u:r:init:s0 tcontext=u:r:init:s0 tclass=capability
> > > >       permissive=0
> > > > 
> > > > Init is PID 0 running as root, so it already has CAP_SYS_ADMIN. For
> > > > better compatibility with older SEPolicy, check ADMIN before NICE.
> > > But with this patch you in turn punish the new/better policies that
> > > try to avoid giving domains CAP_SYS_ADMIN unless necessary (using only
> > > the more granular capabilities wherever possible), which may now get a
> > > bogus sys_admin denial. IMHO the order is better as it is, as it
> > > motivates the "good" policy writing behavior - i.e. spelling out the
> > > capability permissions more explicitly and avoiding CAP_SYS_ADMIN.
> > > 
> > > IOW, if you domain does CAP_SYS_NICE things, and you didn't explicitly
> > > grant it that (and instead rely on the CAP_SYS_ADMIN fallback), then
> > > the denial correctly flags it as an issue in your policy and
> > > encourages you to add that sys_nice permission to the domain. Then
> > > when one beautiful hypothetical day the CAP_SYS_ADMIN fallback is
> > > removed, your policy will be ready for that and things will keep
> > > working.
> > > 
> > > Feel free to carry that patch downstream if patching the kernel is
> > > easier for you than fixing the policy, but for the upstream kernel
> > > this is just a step in the wrong direction.
> > I'm personally fine with this position, but I am curious why "never
> > break userspace" doesn't apply to SELinux policies.
> 
> Because SELinux policy is configuration data, not system code.
> One is free to modify SELinux policy to suit one's whims without
> making any change to the Linux kernel or its APIs.

Sure, but the problem here is when the kernel is updated and the
userspace configuration is not changed and then the kernel can not boot
or has other problems.  That is a kernel regression.

> >   At the end of the
> > day, booting 5.13 or older, we don't get a denial, and there's nothing
> > for the sysadmin to do. On 5.14 and newer, we get denials. This is a
> > common pattern we see each year: some new capability or permission is
> > required where it wasn't required before, and there's no compatibility
> > mechanism to grandfather in old policies.
> 
> This is an artifact of separating policy from mechanism. The
> capability mechanism does not suffer from this issue because
> it embodies its policy. SELinux, Smack, AppArmor and "containers"
> are vulnerable to it because they explicitly deny the kernel and
> kernel developers permission to make assumptions about how they
> define "policy".
> 
> >   So, we have to touch
> > userspace. If this is just how things are, I can certainly update our
> > init.te definitions.
> 
> If SELinux was a required kernel mechanism and the policy was
> included in the kernel tree you might be able to argue that
> kernel developers are responsible for changes to SELinux policy.
> But it ain't, and it isn't.   By design.

Again, when you change the logic in the kernel that then requires you to
also somehow change userspace files in order to keep the kernel booting
properly, that is a problem.

Same thing if we changed the tty api to require different options to be
handled differently.  There is nothing "special" about security policies
from any other user/kernel interaction and api.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] block: Check ADMIN before NICE for IOPRIO_CLASS_RT
  2021-11-15 19:08   ` Alistair Delva
  2021-11-15 21:01     ` Dominick Grift
  2021-11-15 21:42     ` Casey Schaufler
@ 2021-11-16  9:30     ` David Laight
  2021-11-16 13:36       ` Serge E. Hallyn
  2 siblings, 1 reply; 11+ messages in thread
From: David Laight @ 2021-11-16  9:30 UTC (permalink / raw)
  To: 'Alistair Delva', Ondrej Mosnacek
  Cc: Linux kernel mailing list, Khazhismel Kumykov, Bart Van Assche,
	Serge Hallyn, Jens Axboe, Greg Kroah-Hartman, Paul Moore,
	SElinux list, Linux Security Module list, Cc: Android Kernel,
	Linux Stable maillist

From: Alistair Delva
> Sent: 15 November 2021 19:09
...
> > > -                       if (!capable(CAP_SYS_NICE) && !capable(CAP_SYS_ADMIN))
> > > +                       if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_NICE))
> > >                                 return -EPERM;

Isn't the real problem that you actually want to test:
		if (!capable(CAP_SYS_NICE | CAP_SYS_ADMIN))
			return -EPERM;
so that you only get the fail 'splat' when neither is set.

This will be true whenever more than one capability enables something.

Possibly this needs something like:
int capabale_or(unsigned int, ...);
#define capabale_or(...) capabable_or(__VA_LIST__, ~0u)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] block: Check ADMIN before NICE for IOPRIO_CLASS_RT
  2021-11-16  9:30     ` David Laight
@ 2021-11-16 13:36       ` Serge E. Hallyn
  0 siblings, 0 replies; 11+ messages in thread
From: Serge E. Hallyn @ 2021-11-16 13:36 UTC (permalink / raw)
  To: David Laight
  Cc: 'Alistair Delva',
	Ondrej Mosnacek, Linux kernel mailing list, Khazhismel Kumykov,
	Bart Van Assche, Serge Hallyn, Jens Axboe, Greg Kroah-Hartman,
	Paul Moore, SElinux list, Linux Security Module list,
	Cc: Android Kernel, Linux Stable maillist, john.johansen,
	James Morris, Christian Brauner, Tycho Andersen

On Tue, Nov 16, 2021 at 09:30:12AM +0000, David Laight wrote:
> From: Alistair Delva
> > Sent: 15 November 2021 19:09
> ...
> > > > -                       if (!capable(CAP_SYS_NICE) && !capable(CAP_SYS_ADMIN))
> > > > +                       if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_NICE))
> > > >                                 return -EPERM;
> 
> Isn't the real problem that you actually want to test:
> 		if (!capable(CAP_SYS_NICE | CAP_SYS_ADMIN))
> 			return -EPERM;
> so that you only get the fail 'splat' when neither is set.
> 
> This will be true whenever more than one capability enables something.
> 
> Possibly this needs something like:
> int capabale_or(unsigned int, ...);
> #define capabale_or(...) capabable_or(__VA_LIST__, ~0u)
> 
> 	David

Right, that's what i was suggesting yesterday.  We do this in other
places, where we split off a more fine-grained version of a gross
capability.  If we care enough about the audit messages, then we
probably do need a new primitive.

-serge

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-11-16 13:36 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-15 17:38 [PATCH] block: Check ADMIN before NICE for IOPRIO_CLASS_RT Alistair Delva
2021-11-15 18:04 ` Jens Axboe
2021-11-15 18:17   ` Alistair Delva
2021-11-15 19:04 ` Ondrej Mosnacek
2021-11-15 19:08   ` Alistair Delva
2021-11-15 21:01     ` Dominick Grift
2021-11-15 21:42     ` Casey Schaufler
2021-11-16  8:21       ` Greg Kroah-Hartman
2021-11-16  9:30     ` David Laight
2021-11-16 13:36       ` Serge E. Hallyn
2021-11-15 19:31   ` Greg Kroah-Hartman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).