* cgroup2 labeling question
@ 2023-03-20  7:23 Dominick Grift
  2023-03-20 13:35 ` Stephen Smalley
  0 siblings, 1 reply; 29+ messages in thread
From: Dominick Grift @ 2023-03-20  7:23 UTC (permalink / raw)
  To: selinux


Hi,

I was reading this pull request [1] and looked into how I might be able
to implement this in policy but there seem to be some technical
difficulties.

* I already use genfscon to separate the systemd user.slice because the
  system manager delegates the user.slice to the user manager.

  (genfscon "cgroup2" "/user.slice" cgroupfile_context)

  In the past this proved to be racy, where systemd attempts to
  write before the object has the context associated with the genfscon.
  I decided to dontaudit attempts to write to the mislabeled object and
  it *seems* as if systemd retries until it can write it i.e. when the
  object carries the expected label and so that seems to work eventually
  but it looks fragile.

* The challenge with memory pressure implementation [2] is that these
  "memory.pressure" files end up in random locations under
  "/system.slice" for example:

  /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure

  Where in the above systemd-journald.service might be
  templated (systemd-journald@FOO.service). Point is that the path is
  random. genfscon does not support regex and glob. I can't do for example:

  (genfscon "cgroup2" "/system.slice/.*/memory.pressure"
  cgroupfile_context)

  Fortunately cgroup2fs supports relabeling but if systemd has to
  manually relabel the cgroup files then I would imagine that this is
  racy as well, and that does not really solve the underlying issue.

  I am looking for ideas and suggestions
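
The limitation described in the second bullet can be illustrated with a
small model. This is only a sketch of the idea, not kernel code: it assumes
genfscon entries are matched as literal path prefixes with the longest
entry winning, and the context strings below are hypothetical placeholders
rather than labels from any real policy.

```python
import re

# Hypothetical genfscon entries for "cgroup2": (literal path prefix, context).
GENFS_ENTRIES = [
    ("/", "system_u:object_r:cgroup_t:s0"),
    ("/user.slice", "system_u:object_r:user_cgroup_t:s0"),
]


def genfs_lookup(path, entries=GENFS_ENTRIES):
    """Return the context of the longest entry that is a literal prefix of path."""
    for prefix, context in sorted(entries, key=lambda e: len(e[0]), reverse=True):
        if path.startswith(prefix):
            return context
    return None


def wanted_match(path):
    """The pattern one would like to express, which a prefix match cannot."""
    return re.fullmatch(r"/system\.slice/[^/]+/memory\.pressure", path) is not None
```

With only prefix matching, a "memory.pressure" file under any unit in
"/system.slice" falls through to the generic "/" entry, because the unit
name in the middle of the path cannot be wildcarded.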

[1] https://github.com/SELinuxProject/refpolicy/pull/607
[2] https://github.com/systemd/systemd/blob/main/docs/MEMORY_PRESSURE.md
-- 
gpg --locate-keys dominick.grift@defensec.nl
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
Dominick Grift

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: cgroup2 labeling question
  2023-03-20  7:23 cgroup2 labeling question Dominick Grift
@ 2023-03-20 13:35 ` Stephen Smalley
  2023-03-20 13:57   ` Dominick Grift
  0 siblings, 1 reply; 29+ messages in thread
From: Stephen Smalley @ 2023-03-20 13:35 UTC (permalink / raw)
  To: Dominick Grift, Ondrej Mosnacek, Paul Moore; +Cc: selinux

On Mon, Mar 20, 2023 at 3:25 AM Dominick Grift
<dominick.grift@defensec.nl> wrote:
>
>
> Hi,
>
> I was reading this pull request [1] and looked into how I might be able
> to implement this in policy but there seem to be some technical
> difficulties.
>
> * I already use genfscon to separate the systemd user.slice because the
>   system manager delegates the user.slice to the user manager.
>
>   (genfscon "cgroup2" "/user.slice" cgroupfile_context)
>
>   In the past this proved to be racy, where systemd attempts to
>   write before the object has the context associated with the genfscon.

I don't understand how this could be racy - genfscon-assigned contexts
should be assigned when the dentry is first instantiated via
inode_doinit_with_dentry and therefore the inode shouldn't be
accessible to userspace prior to this initial assignment AFAIK.
Possibly I am missing something.

>   I decided to dontaudit attempts to write to the mislabeled object and
>   it *seems* as if systemd retries until it can write it i.e. when the
>   object carries the expected label and so that seems to work eventually
>   but it looks fragile.
>
> * The challenge with memory pressure implementation [2] is that these
>   "memory.pressure" files end up in random locations under
>   "/system.slice" for example:
>
>   /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure
>
>   Where in the above systemd-journald.service might be
>   templated (systemd-journald@FOO.service). Point is that the path is
>   random. genfscon does not support regex and glob. I can't do for example:
>
>   (genfscon "cgroup2" "/system.slice/.*/memory.pressure"
>   cgroupfile_context)
>
>   Fortunately cgroup2fs supports relabeling but if systemd has to
>   manually relabel the cgroup files then I would imagine that this is
>   racy as well, and that does not really solve the underlying issue.
>
>   I am looking for ideas and suggestions

Optimally one of two things would happen:
1. The kernel would label the inode correctly when it is first created
(e.g. by augmenting genfscon to support more general matching), or
2. The userspace component that creates these files would label them
correctly at creation (via setfscreatecon() prior to creation).

Pardon my ignorance but what creates these files initially? The kernel
in response to some event or systemd or some other userspace
component?
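
For reference, option 2 can be sketched roughly as below. This is a minimal
sketch under assumptions, not what systemd does: it mimics setfscreatecon()
by writing to the per-thread "attr/fscreate" file in procfs, it only helps
where userspace itself performs the creation, and the context passed in is
whatever the caller supplies (any example value is hypothetical). The
attr_path parameter exists only so the helper can be exercised without
SELinux.

```python
import os
from contextlib import contextmanager

# Per-thread SELinux attribute that setfscreatecon() writes to on Linux.
FSCREATE_ATTR = "/proc/thread-self/attr/fscreate"


@contextmanager
def fscreate_context(context, attr_path=FSCREATE_ATTR):
    """Set the creation context for new objects, then reset it afterwards."""
    fd = os.open(attr_path, os.O_WRONLY)
    try:
        # The context is written as a NUL-terminated string.
        os.write(fd, context.encode() + b"\0")
        yield
    finally:
        try:
            # A zero-length write asks for the policy default again,
            # mirroring how libselinux handles a NULL context.
            os.write(fd, b"")
        finally:
            os.close(fd)
```

A caller would wrap the creating call, for example
"with fscreate_context(ctx): os.mkdir(path)", so that the new object is
labeled at creation time rather than relabeled afterwards.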

> [1] https://github.com/SELinuxProject/refpolicy/pull/607
> [2] https://github.com/systemd/systemd/blob/main/docs/MEMORY_PRESSURE.md


* Re: cgroup2 labeling question
  2023-03-20 13:35 ` Stephen Smalley
@ 2023-03-20 13:57   ` Dominick Grift
  2023-03-20 14:12     ` Ondrej Mosnacek
  0 siblings, 1 reply; 29+ messages in thread
From: Dominick Grift @ 2023-03-20 13:57 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: Ondrej Mosnacek, Paul Moore, selinux

Stephen Smalley <stephen.smalley.work@gmail.com> writes:

> On Mon, Mar 20, 2023 at 3:25 AM Dominick Grift
> <dominick.grift@defensec.nl> wrote:
>>
>>
>> Hi,
>>
>> I was reading this pull request [1] and looked into how I might be able
>> to implement this in policy but there seem to be some technical
>> difficulties.
>>
>> * I already use genfscon to separate the systemd user.slice because the
>>   system manager delegates the user.slice to the user manager.
>>
>>   (genfscon "cgroup2" "/user.slice" cgroupfile_context)
>>
>>   In the past this proved to be racy, where systemd attempts to
>>   write before the object has the context associated with the genfscon.
>
> I don't understand how this could be racy - genfscon-assigned contexts
> should be assigned when the dentry is first instantiated via
> inode_doinit_with_dentry and therefore the inode shouldn't be
> accessible to userspace prior to this initial assignment AFAIK.
> Possibly I am missing something.

I recall encountering this sporadically, but I admit that it has been a
while since I suppressed it in policy. I might try to reproduce. AFAIK my
policy is the only policy that actually labels some trees on cgroup2 fs
with private types currently.

>
>>   I decided to dontaudit attempts to write to the mislabeled object and
>>   it *seems* as if systemd retries until it can write it i.e. when the
>>   object carries the expected label and so that seems to work eventually
>>   but it looks fragile.
>>
>> * The challenge with memory pressure implementation [2] is that these
>>   "memory.pressure" files end up in random locations under
>>   "/system.slice" for example:
>>
>>   /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure
>>
>>   Where in the above systemd-journald.service might be
>>   templated (systemd-journald@FOO.service). Point is that the path is
>>   random. genfscon does not support regex and glob. I can't do for example:
>>
>>   (genfscon "cgroup2" "/system.slice/.*/memory.pressure"
>>   cgroupfile_context)
>>
>>   Fortunately cgroup2fs supports relabeling but if systemd has to
>>   manually relabel the cgroup files then I would imagine that this is
>>   racy as well, and that does not really solve the underlying issue.
>>
>>   I am looking for ideas and suggestions
>
> Optimally one of two things would happen:
> 1. The kernel would label the inode correctly when it is first created
> (e.g. by augmenting genfscon to support more general matching), or
> 2. The userspace component that creates these files would label them
> correctly at creation (via setfscreatecon() prior to creation).

Agree but 1. would require regex/glob support for genfscon and 2. these
files aren't "created" by userspace AFAIK and so setfscreatecon or
automatic object type transitions are probably not an option here.

>
> Pardon my ignorance but what creates these files initially? The kernel
> in response to some event or systemd or some other userspace
> component?

Yes AFAIK it is the former (pseudo filesystem similar to procfs, debugfs
in that sense). This is also why I don't think that the PR mentioned is
tested because cgroup2 fs labeling is done with genfscon and not fsuse
trans or fsuse xattr so even if the files would be created by
userspace (which I think is not the case) the specified automatic object
type transition rule wouldn't work.

I think that, for now, we probably have little choice but to make systemd
reset the context of said cgroup file manually. Just wanted to see if
there are alternatives.
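
A manual reset along those lines might look like the sketch below. It is an
illustration only and makes several assumptions: the cgroup2 mount point is
"/sys/fs/cgroup", only units directly under "system.slice" are of interest,
the label is set through the "security.selinux" extended attribute, and the
target context is supplied by the caller (any example value is
hypothetical). The setxattr parameter is there so the logic can be tried
without SELinux or privileges.

```python
import glob
import os

SELINUX_XATTR = "security.selinux"


def find_pressure_files(cgroup_root="/sys/fs/cgroup"):
    """Return memory.pressure files of units directly under system.slice."""
    pattern = os.path.join(
        glob.escape(cgroup_root), "system.slice", "*", "memory.pressure"
    )
    return sorted(glob.glob(pattern))


def relabel(paths, context, setxattr=None):
    """Set the SELinux context on each path; return the paths that failed."""
    if setxattr is None:
        setxattr = os.setxattr  # Linux only
    value = context.encode() + b"\0"
    failed = []
    for path in paths:
        try:
            setxattr(path, SELINUX_XATTR, value)
        except OSError:
            # The cgroup may already be gone, or policy may deny the relabel.
            failed.append(path)
    return failed
```

Because the files appear whenever a unit's cgroup is created, such a pass
would have to run after each creation, which is where the window for a
race comes from.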

>
>> [1] https://github.com/SELinuxProject/refpolicy/pull/607
>> [2] https://github.com/systemd/systemd/blob/main/docs/MEMORY_PRESSURE.md

-- 
gpg --locate-keys dominick.grift@defensec.nl
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
Dominick Grift


* Re: cgroup2 labeling question
  2023-03-20 13:57   ` Dominick Grift
@ 2023-03-20 14:12     ` Ondrej Mosnacek
  2023-03-20 14:19       ` Dominick Grift
  0 siblings, 1 reply; 29+ messages in thread
From: Ondrej Mosnacek @ 2023-03-20 14:12 UTC (permalink / raw)
  To: Dominick Grift; +Cc: Stephen Smalley, Paul Moore, selinux

On Mon, Mar 20, 2023 at 2:59 PM Dominick Grift
<dominick.grift@defensec.nl> wrote:
>
> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
>
> > On Mon, Mar 20, 2023 at 3:25 AM Dominick Grift
> > <dominick.grift@defensec.nl> wrote:
> >>
> >>
> >> Hi,
> >>
> >> I was reading this pull request [1] and looked into how I might be able
> >> to implement this in policy but there seem to be some technical
> >> difficulties.
> >>
> >> * I already use genfscon to separate the systemd user.slice because the
> >>   system manager delegates the user.slice to the user manager.
> >>
> >>   (genfscon "cgroup2" "/user.slice" cgroupfile_context)
> >>
> >>   In the past this proved to be racy, where systemd attempts to
> >>   write before the object has the context associated with the genfscon.
> >
> > I don't understand how this could be racy - genfscon-assigned contexts
> > should be assigned when the dentry is first instantiated via
> > inode_doinit_with_dentry and therefore the inode shouldn't be
> > accessible to userspace prior to this initial assignment AFAIK.
> > Possibly I am missing something.
>
> I recall encountering this sporadically, but I admit that it has been a
> while since I suppressed it in policy. I might try to reproduce. AFAIK my
> policy is the only policy that actually labels some trees on cgroup2 fs
> with private types currently.
>
> >
> >>   I decided to dontaudit attempts to write to the mislabeled object and
> >>   it *seems* as if systemd retries until it can write it i.e. when the
> >>   object carries the expected label and so that seems to work eventually
> >>   but it looks fragile.
> >>
> >> * The challenge with memory pressure implementation [2] is that these
> >>   "memory.pressure" files end up in random locations under
> >>   "/system.slice" for example:
> >>
> >>   /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure
> >>
> >>   Where in the above systemd-journald.service might be
> >>   templated (systemd-journald@FOO.service). Point is that the path is
> >>   random. genfscon does not support regex and glob. I can't do for example:
> >>
> >>   (genfscon "cgroup2" "/system.slice/.*/memory.pressure"
> >>   cgroupfile_context)
> >>
> >>   Fortunately cgroup2fs supports relabeling but if systemd has to
> >>   manually relabel the cgroup files then I would imagine that this is
> >>   racy as well, and that does not really solve the underlying issue.
> >>
> >>   I am looking for ideas and suggestions
> >
> > Optimally one of two things would happen:
> > 1. The kernel would label the inode correctly when it is first created
> > (e.g. by augmenting genfscon to support more general matching), or
> > 2. The userspace component that creates these files would label them
> > correctly at creation (via setfscreatecon() prior to creation).
>
> Agree but 1. would require regex/glob support for genfscon and 2. these
> files aren't "created" by userspace AFAIK and so setfscreatecon or
> automatic object type transitions are probably not an option here.
>
> >
> > Pardon my ignorance but what creates these files initially? The kernel
> > in response to some event or systemd or some other userspace
> > component?
>
> Yes AFAIK it is the former (pseudo filesystem similar to procfs, debugfs
> in that sense). This is also why I don't think that the PR mentioned is
> tested because cgroup2 fs labeling is done with genfscon and not fsuse
> trans or fsuse xattr so even if the files would be created by
> userspace (which I think is not the case) the specified automatic object
> type transition rule wouldn't work.

Actually, type transitions on cgroupfs should work - I added special
hooks for kernfs just for that some time ago - see kernel commits
d0c9c153b4bd6963c8fcccbc0caa12e8fa8d971d..e19dfdc83b60f196e0653d683499f7bc5548128f.

Not sure what's behind the genfscon label assignment race, though.

>
> I think that, for now, we probably have little choice but to make systemd
> reset the context of said cgroup file manually. Just wanted to see if
> there are alternatives.
>
> >
> >> [1] https://github.com/SELinuxProject/refpolicy/pull/607
> >> [2] https://github.com/systemd/systemd/blob/main/docs/MEMORY_PRESSURE.md
>


-- 
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.



* Re: cgroup2 labeling question
  2023-03-20 14:12     ` Ondrej Mosnacek
@ 2023-03-20 14:19       ` Dominick Grift
  2023-03-20 14:43         ` Dominick Grift
  2023-03-20 14:46         ` Ondrej Mosnacek
  0 siblings, 2 replies; 29+ messages in thread
From: Dominick Grift @ 2023-03-20 14:19 UTC (permalink / raw)
  To: Ondrej Mosnacek; +Cc: Stephen Smalley, Paul Moore, selinux

Ondrej Mosnacek <omosnace@redhat.com> writes:

> On Mon, Mar 20, 2023 at 2:59 PM Dominick Grift
> <dominick.grift@defensec.nl> wrote:
>>
>> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
>>
>> > On Mon, Mar 20, 2023 at 3:25 AM Dominick Grift
>> > <dominick.grift@defensec.nl> wrote:
>> >>
>> >>
>> >> Hi,
>> >>
>> >> I was reading this pull request [1] and looked into how I might be able
>> >> to implement this in policy but there seem to be some technical
>> >> difficulties.
>> >>
>> >> * I already use genfscon to separate the systemd user.slice because the
>> >>   system manager delegates the user.slice to the user manager.
>> >>
>> >>   (genfscon "cgroup2" "/user.slice" cgroupfile_context)
>> >>
>> >>   In the past this proved to be racy, where systemd attempts to
>> >>   write before the object has the context associated with the genfscon.
>> >
>> > I don't understand how this could be racy - genfscon-assigned contexts
>> > should be assigned when the dentry is first instantiated via
>> > inode_doinit_with_dentry and therefore the inode shouldn't be
>> > accessible to userspace prior to this initial assignment AFAIK.
>> > Possibly I am missing something.
>>
>> I recall encountering this sporadically, but I admit that it has been a
>> while since I suppressed it in policy. I might try to reproduce. AFAIK my
>> policy is the only policy that actually labels some trees on cgroup2 fs
>> with private types currently.
>>
>> >
>> >>   I decided to dontaudit attempts to write to the mislabeled object and
>> >>   it *seems* as if systemd retries until it can write it i.e. when the
>> >>   object carries the expected label and so that seems to work eventually
>> >>   but it looks fragile.
>> >>
>> >> * The challenge with memory pressure implementation [2] is that these
>> >>   "memory.pressure" files end up in random locations under
>> >>   "/system.slice" for example:
>> >>
>> >>   /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure
>> >>
>> >>   Where in the above systemd-journald.service might be
>> >>   templated (systemd-journald@FOO.service). Point is that the path is
>> >>   random. genfscon does not support regex and glob. I can't do for example:
>> >>
>> >>   (genfscon "cgroup2" "/system.slice/.*/memory.pressure"
>> >>   cgroupfile_context)
>> >>
>> >>   Fortunately cgroup2fs supports relabeling but if systemd has to
>> >>   manually relabel the cgroup files then I would imagine that this is
>> >>   racy as well, and that does not really solve the underlying issue.
>> >>
>> >>   I am looking for ideas and suggestions
>> >
>> > Optimally one of two things would happen:
>> > 1. The kernel would label the inode correctly when it is first created
>> > (e.g. by augmenting genfscon to support more general matching), or
>> > 2. The userspace component that creates these files would label them
>> > correctly at creation (via setfscreatecon() prior to creation).
>>
>> Agree but 1. would require regex/glob support for genfscon and 2. these
>> files aren't "created" by userspace AFAIK and so setfscreatecon or
>> automatic object type transitions are probably not an option here.
>>
>> >
>> > Pardon my ignorance but what creates these files initially? The kernel
>> > in response to some event or systemd or some other userspace
>> > component?
>>
>> Yes AFAIK it is the former (pseudo filesystem similar to procfs, debugfs
>> in that sense). This is also why I don't think that the PR mentioned is
>> tested because cgroup2 fs labeling is done with genfscon and not fsuse
>> trans or fsuse xattr so even if the files would be created by
>> userspace (which I think is not the case) the specified automatic object
>> type transition rule wouldn't work.
>
> Actually, type transitions on cgroupfs should work - I added special
> hooks for kernfs just for that some time ago - see kernel commits
> d0c9c153b4bd6963c8fcccbc0caa12e8fa8d971d..e19dfdc83b60f196e0653d683499f7bc5548128f.

Interesting. I will try this out. Would this not require at least a
"fsuse trans" statement in policy?

https://github.com/SELinuxProject/refpolicy/blob/master/policy/modules/kernel/filesystem.te#L89

Also I am not sure if that support would make much sense on a filesystem
where files are created by the kernel in reaction to some event.

>
> Not sure what's behind the genfscon label assignment race, though.
>
>>
>> I think that, for now, we probably have little choice but to make systemd
>> reset the context of said cgroup file manually. Just wanted to see if
>> there are alternatives.
>>
>> >
>> >> [1] https://github.com/SELinuxProject/refpolicy/pull/607
>> >> [2] https://github.com/systemd/systemd/blob/main/docs/MEMORY_PRESSURE.md
>>

-- 
gpg --locate-keys dominick.grift@defensec.nl
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
Dominick Grift


* Re: cgroup2 labeling question
  2023-03-20 14:19       ` Dominick Grift
@ 2023-03-20 14:43         ` Dominick Grift
  2023-03-20 14:46         ` Ondrej Mosnacek
  1 sibling, 0 replies; 29+ messages in thread
From: Dominick Grift @ 2023-03-20 14:43 UTC (permalink / raw)
  To: Ondrej Mosnacek; +Cc: Stephen Smalley, Paul Moore, selinux

Dominick Grift <dominick.grift@defensec.nl> writes:

> Ondrej Mosnacek <omosnace@redhat.com> writes:
>
>> On Mon, Mar 20, 2023 at 2:59 PM Dominick Grift
>> <dominick.grift@defensec.nl> wrote:
>>>
>>> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
>>>
>>> > On Mon, Mar 20, 2023 at 3:25 AM Dominick Grift
>>> > <dominick.grift@defensec.nl> wrote:
>>> >>
>>> >>
>>> >> Hi,
>>> >>
>>> >> I was reading this pull request [1] and looked into how I might be able
>>> >> to implement this in policy but there seem to be some technical
>>> >> difficulties.
>>> >>
>>> >> * I already use genfscon to separate the systemd user.slice because the
>>> >>   system manager delegates the user.slice to the user manager.
>>> >>
>>> >>   (genfscon "cgroup2" "/user.slice" cgroupfile_context)
>>> >>
>>> >>   In the past this proved to be racy, where systemd attempts to
>>> >>   write before the object has the context associated with the genfscon.
>>> >
>>> > I don't understand how this could be racy - genfscon-assigned contexts
>>> > should be assigned when the dentry is first instantiated via
>>> > inode_doinit_with_dentry and therefore the inode shouldn't be
>>> > accessible to userspace prior to this initial assignment AFAIK.
>>> > Possibly I am missing something.
>>>
>>> I recall encountering this sporadically, but I admit that it has been a
>>> while since I suppressed it in policy. I might try to reproduce. AFAIK my
>>> policy is the only policy that actually labels some trees on cgroup2 fs
>>> with private types currently.
>>>
>>> >
>>> >>   I decided to dontaudit attempts to write to the mislabeled object and
>>> >>   it *seems* as if systemd retries until it can write it i.e. when the
>>> >>   object carries the expected label and so that seems to work eventually
>>> >>   but it looks fragile.
>>> >>
>>> >> * The challenge with memory pressure implementation [2] is that these
>>> >>   "memory.pressure" files end up in random locations under
>>> >>   "/system.slice" for example:
>>> >>
>>> >>   /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure
>>> >>
>>> >>   Where in the above systemd-journald.service might be
>>> >>   templated (systemd-journald@FOO.service). Point is that the path is
>>> >>   random. genfscon does not support regex and glob. I can't do for example:
>>> >>
>>> >>   (genfscon "cgroup2" "/system.slice/.*/memory.pressure"
>>> >>   cgroupfile_context)
>>> >>
>>> >>   Fortunately cgroup2fs supports relabeling but if systemd has to
>>> >>   manually relabel the cgroup files then I would imagine that this is
>>> >>   racy as well, and that does not really solve the underlying issue.
>>> >>
>>> >>   I am looking for ideas and suggestions
>>> >
>>> > Optimally one of two things would happen:
>>> > 1. The kernel would label the inode correctly when it is first created
>>> > (e.g. by augmenting genfscon to support more general matching), or
>>> > 2. The userspace component that creates these files would label them
>>> > correctly at creation (via setfscreatecon() prior to creation).
>>>
>>> Agree but 1. would require regex/glob support for genfscon and 2. these
>>> files aren't "created" by userspace AFAIK and so setfscreatecon or
>>> automatic object type transitions are probably not an option here.
>>>
>>> >
>>> > Pardon my ignorance but what creates these files initially? The kernel
>>> > in response to some event or systemd or some other userspace
>>> > component?
>>>
>>> Yes AFAIK it is the former (pseudo filesystem similar to procfs, debugfs
>>> in that sense). This is also why I don't think that the PR mentioned is
>>> tested because cgroup2 fs labeling is done with genfscon and not fsuse
>>> trans or fsuse xattr so even if the files would be created by
>>> userspace (which I think is not the case) the specified automatic object
>>> type transition rule wouldn't work.
>>
>> Actually, type transitions on cgroupfs should work - I added special
>> hooks for kernfs just for that some time ago - see kernel commits
>> d0c9c153b4bd6963c8fcccbc0caa12e8fa8d971d..e19dfdc83b60f196e0653d683499f7bc5548128f.
>
> Interesting. I will try this out. Would this not require at least a
> "fsuse trans" statement in policy?
>
> https://github.com/SELinuxProject/refpolicy/blob/master/policy/modules/kernel/filesystem.te#L89
>
> Also I am not sure if that support would make much sense on a filesystem
> where files are created by the kernel in reaction to some event.

I tried this out:

1. you can't create files on cgroupfs so adding support for transitions
does not seem to make a whole lot of sense to me

2. you can add a `fsuse trans "cgroup2"` statement in policy instead of a
`genfscon "cgroup2"` statement but it does not make sense as you cannot
create files on there anyway.

3. the PR mentioned is probably untested because type transitions do not work

>
>>
>> Not sure what's behind the genfscon label assignment race, though.
>>
>>>
>>> I think that, for now, we probably have little choice but to make systemd
>>> reset the context of said cgroup file manually. Just wanted to see if
>>> there are alternatives.
>>>
>>> >
>>> >> [1] https://github.com/SELinuxProject/refpolicy/pull/607
>>> >> [2] https://github.com/systemd/systemd/blob/main/docs/MEMORY_PRESSURE.md
>>>

-- 
gpg --locate-keys dominick.grift@defensec.nl
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
Dominick Grift


* Re: cgroup2 labeling question
  2023-03-20 14:19       ` Dominick Grift
  2023-03-20 14:43         ` Dominick Grift
@ 2023-03-20 14:46         ` Ondrej Mosnacek
  2023-03-20 15:16           ` Stephen Smalley
  1 sibling, 1 reply; 29+ messages in thread
From: Ondrej Mosnacek @ 2023-03-20 14:46 UTC (permalink / raw)
  To: Dominick Grift; +Cc: Stephen Smalley, Paul Moore, selinux

On Mon, Mar 20, 2023 at 3:19 PM Dominick Grift
<dominick.grift@defensec.nl> wrote:
>
> Ondrej Mosnacek <omosnace@redhat.com> writes:
>
> > On Mon, Mar 20, 2023 at 2:59 PM Dominick Grift
> > <dominick.grift@defensec.nl> wrote:
> >>
> >> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
> >>
> >> > On Mon, Mar 20, 2023 at 3:25 AM Dominick Grift
> >> > <dominick.grift@defensec.nl> wrote:
> >> >>
> >> >>
> >> >> Hi,
> >> >>
> >> >> I was reading this pull request [1] and looked into how I might be able
> >> >> to implement this in policy but there seem to be some technical
> >> >> difficulties.
> >> >>
> >> >> * I already use genfscon to separate the systemd user.slice because the
> >> >>   system manager delegates the user.slice to the user manager.
> >> >>
> >> >>   (genfscon "cgroup2" "/user.slice" cgroupfile_context)
> >> >>
> >> >>   In the past this proved to be racy, where systemd attempts to
> >> >>   write before the object has the context associated with the genfscon.
> >> >
> >> > I don't understand how this could be racy - genfscon-assigned contexts
> >> > should be assigned when the dentry is first instantiated via
> >> > inode_doinit_with_dentry and therefore the inode shouldn't be
> >> > accessible to userspace prior to this initial assignment AFAIK.
> >> > Possibly I am missing something.
> >>
> >> I recall encountering this sporadically, but I admit that it has been a
> >> while since I suppressed it in policy. I might try to reproduce. AFAIK my
> >> policy is the only policy that actually labels some trees on cgroup2 fs
> >> with private types currently.
> >>
> >> >
> >> >>   I decided to dontaudit attempts to write to the mislabeled object and
> >> >>   it *seems* as if systemd retries until it can write it i.e. when the
> >> >>   object carries the expected label and so that seems to work eventually
> >> >>   but it looks fragile.
> >> >>
> >> >> * The challenge with memory pressure implementation [2] is that these
> >> >>   "memory.pressure" files end up in random locations under
> >> >>   "/system.slice" for example:
> >> >>
> >> >>   /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure
> >> >>
> >> >>   Where in the above systemd-journald.service might be
> >> >>   templated (systemd-journald@FOO.service). Point is that the path is
> >> >>   random. genfscon does not support regex and glob. I can't do for example:
> >> >>
> >> >>   (genfscon "cgroup2" "/system.slice/.*/memory.pressure"
> >> >>   cgroupfile_context)
> >> >>
> >> >>   Fortunately cgroup2fs supports relabeling but if systemd has to
> >> >>   manually relabel the cgroup files then I would imagine that this is
> >> >>   racy as well, and that does not really solve the underlying issue.
> >> >>
> >> >>   I am looking for ideas and suggestions
> >> >
> >> > Optimally one of two things would happen:
> >> > 1. The kernel would label the inode correctly when it is first created
> >> > (e.g. by augmenting genfscon to support more general matching), or
> >> > 2. The userspace component that creates these files would label them
> >> > correctly at creation (via setfscreatecon() prior to creation).
> >>
> >> Agree but 1. would require regex/glob support for genfscon and 2. these
> >> files aren't "created" by userspace AFAIK and so setfscreatecon or
> >> automatic object type transitions are probably not an option here.
> >>
> >> >
> >> > Pardon my ignorance but what creates these files initially? The kernel
> >> > in response to some event or systemd or some other userspace
> >> > component?
> >>
> >> Yes AFAIK it is the former (pseudo filesystem similar to procfs, debugfs
> >> in that sense). This is also why I don't think that the PR mentioned is
> >> tested because cgroup2 fs labeling is done with genfscon and not fsuse
> >> trans or fsuse xattr so even if the files would be created by
> >> userspace (which I think is not the case) the specified automatic object
> >> type transition rule wouldn't work.
> >
> > Actually, type transitions on cgroupfs should work - I added special
> > hooks for kernfs just for that some time ago - see kernel commits
> > d0c9c153b4bd6963c8fcccbc0caa12e8fa8d971d..e19dfdc83b60f196e0653d683499f7bc5548128f.
>
> Interesting. I will try this out. Would this not require at least a
> "fsuse trans" statement in policy?

No, it should work alongside genfscon. cgroupfs already was special
before that as it allowed relabeling despite genfscon being used.

>
> https://github.com/SELinuxProject/refpolicy/blob/master/policy/modules/kernel/filesystem.te#L89
>
> Also I am not sure if that support would make much sense on a filesystem
> where files are created by the kernel in reaction to some event.

It does make sense with named transitions, plus it was needed to make
even a simple parent-child inheritance work. Also, I believe some
cgroupfs files/directories (I think only directories?) can be created
by userspace, too.
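
To make the distinction between named transitions and plain parent-child
inheritance concrete, here is a toy model of the lookup order. It is a
simplified sketch, not the kernel's implementation, and it does not model
under which conditions the kernfs hooks actually fire. All type names
(systemd_t, cgroup_t, memory_pressure_t, systemd_cgroup_t) are hypothetical
and used for illustration only.

```python
# (source type, parent type, object class, file name or None) -> new type
RULES = {
    ("systemd_t", "cgroup_t", "file", "memory.pressure"): "memory_pressure_t",
    ("systemd_t", "cgroup_t", "dir", None): "systemd_cgroup_t",
}


def compute_type(source, parent, tclass, name, rules=RULES):
    """Pick the type for a new object: named rule, then unnamed rule, then parent."""
    named = rules.get((source, parent, tclass, name))
    if named is not None:
        return named
    unnamed = rules.get((source, parent, tclass, None))
    if unnamed is not None:
        return unnamed
    # No matching rule: the new node inherits the parent directory's type.
    return parent
```

In this model a file called "memory.pressure" gets its own type wherever
its parent carries the expected type, regardless of the unit name in the
path, which is what a path-based genfscon entry cannot express.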

-- 
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.



* Re: cgroup2 labeling question
  2023-03-20 14:46         ` Ondrej Mosnacek
@ 2023-03-20 15:16           ` Stephen Smalley
  2023-03-20 15:23             ` Dominick Grift
  0 siblings, 1 reply; 29+ messages in thread
From: Stephen Smalley @ 2023-03-20 15:16 UTC (permalink / raw)
  To: Ondrej Mosnacek; +Cc: Dominick Grift, Paul Moore, selinux

On Mon, Mar 20, 2023 at 10:46 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>
> On Mon, Mar 20, 2023 at 3:19 PM Dominick Grift
> <dominick.grift@defensec.nl> wrote:
> >
> > Ondrej Mosnacek <omosnace@redhat.com> writes:
> >
> > > On Mon, Mar 20, 2023 at 2:59 PM Dominick Grift
> > > <dominick.grift@defensec.nl> wrote:
> > >>
> > >> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
> > >>
> > >> > On Mon, Mar 20, 2023 at 3:25 AM Dominick Grift
> > >> > <dominick.grift@defensec.nl> wrote:
> > >> >>
> > >> >>
> > >> >> Hi,
> > >> >>
> > >> >> I was reading this pull request [1] and looked into how I might be able
> > >> >> to implement this in policy but there seem to be some technical
> > >> >> difficulties.
> > >> >>
> > >> >> * I already use genfscon to separate the systemd user.slice because the
> > >> >>   system manager delegates the user.slice to the user manager.
> > >> >>
> > >> >>   (genfscon "cgroup2" "/user.slice" cgroupfile_context)
> > >> >>
> > >> >>   In the past this proved to be racy, where systemd attempts to
> > >> >>   write before the object has the context associated with the genfscon.
> > >> >
> > >> > I don't understand how this could be racy - genfscon-assigned contexts
> > >> > should be assigned when the dentry is first instantiated via
> > >> > inode_doinit_with_dentry and therefore the inode shouldn't be
> > >> > accessible to userspace prior to this initial assignment AFAIK.
> > >> > Possibly I am missing something.
> > >>
> > >> I recall encountering this sporadically, but I admit that it has been a
> > >> while since I supressed it in policy. I might try to reproduce. AFAIK my
> > >> policy is the only policy that actually labels some trees on cgroup2 fs
> > >> with private types currently.
> > >>
> > >> >
> > >> >>   I decided to dontaudit attempts to write to the mislabeled object and
> > >> >>   it *seems* as if systemd retries until it can write it i.e. when the
> > >> >>   object carries the expected label and so that seems to work eventually
> > >> >>   but it looks fragile.
> > >> >>
> > >> >> * The challenge with memory pressure implementation [2] is that these
> > >> >>   "memory.pressure" files end up in random locations under
> > >> >>   "/system.slice" for example:
> > >> >>
> > >> >>   /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure
> > >> >>
> > >> >>   Where in the above systemd-journald.service might be
> > >> >>   templated (systemd-journald@FOO.service). Point is that the path is
> > >> >>   random. genfscon does not support regex and glob. I can't do for example:
> > >> >>
> > >> >>   (genfscon "cgroup2" "/system.slice/.*/memory.pressure"
> > >> >>   cgroupfile_context)
> > >> >>
> > >> >>   Fortunately cgroup2fs supports relabeling but if systemd has to
> > >> >>   manually relabel the cgroup files then I would imagine that this is
> > >> >>   racy as well, and that does not really solve the underlying issue.
> > >> >>
> > >> >>   I am looking for ideas and suggestions
> > >> >
> > >> > Optimally one of two things would happen:
> > >> > 1. The kernel would label the inode correctly when it is first created
> > >> > (e.g. by augmenting genfscon to support more general matching), or
> > >> > 2. The userspace component that creates these files would label them
> > >> > correctly at creation (via setfscreatecon() prior to creation).
> > >>
> > >> Agree but 1. would require regex/glob support for genfscon and 2. these
> > >> files aren't "created" by userspace AFAIK and so setfscreatecon or
> > >> automatic object type transitions are probably not an option here.
> > >>
> > >> >
> > >> > Pardon my ignorance but what creates these files initially? The kernel
> > >> > in response to some event or systemd or some other userspace
> > >> > component?
> > >>
> > >> Yes AFAIK it is the former (psuedo filesystem similar to procfs, debugfs
> > >> in that sense). This is also why I don't think that the PR mentioned is
> > >> tested because cgroup2 fs labeling is done with genfscon and not fsuse
> > >> trans or fsuse xattr so even if the files would be created by
> > >> userspace (which I think is not the case) the specified automatic object
> > >> type transition rule wouldnt work.
> > >
> > > Actually, type transitions on cgroupfs should work - I added special
> > > hooks for kernfs just for that some time ago - see kernel commits
> > > d0c9c153b4bd6963c8fcccbc0caa12e8fa8d971d..e19dfdc83b60f196e0653d683499f7bc5548128f.
> >
> > Interesting. I will try this out. Would this not require at least a
> > "fsuse trans" statement in policy?
>
> No, it should work alongside genfscon. cgroupfs already was special
> before that as it allowed relabeling despite genfscon being used.
>
> >
> > https://github.com/SELinuxProject/refpolicy/blob/master/policy/modules/kernel/filesystem.te#L89
> >
> > Also I am not sure if that support would make much sense on a filesystem
> > where files are created by the kernel in reaction to some event.
>
> It does make sense with named transitions, plus it was needed to make
> even a simple parent-child inheritance work. Also, I believe some
> cgroupfs files/directories (I think only directories?) can be created
> by userspace, too.

We should likely check that the SELinux Notebook and/or other
documentation reflects this support and states which filesystem types
fall into each group: those that support genfscon + setxattr only, and
those that support genfscon + setxattr + type_transition rules.

* Re: cgroup2 labeling question
  2023-03-20 15:16           ` Stephen Smalley
@ 2023-03-20 15:23             ` Dominick Grift
  2023-03-20 16:32               ` Stephen Smalley
  0 siblings, 1 reply; 29+ messages in thread
From: Dominick Grift @ 2023-03-20 15:23 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: Ondrej Mosnacek, Paul Moore, selinux

Stephen Smalley <stephen.smalley.work@gmail.com> writes:

> On Mon, Mar 20, 2023 at 10:46 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>>
>> On Mon, Mar 20, 2023 at 3:19 PM Dominick Grift
>> <dominick.grift@defensec.nl> wrote:
>> >
>> > Ondrej Mosnacek <omosnace@redhat.com> writes:
>> >
>> > > On Mon, Mar 20, 2023 at 2:59 PM Dominick Grift
>> > > <dominick.grift@defensec.nl> wrote:
>> > >>
>> > >> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
>> > >>
>> > >> > On Mon, Mar 20, 2023 at 3:25 AM Dominick Grift
>> > >> > <dominick.grift@defensec.nl> wrote:
>> > >> >>
>> > >> >>
>> > >> >> Hi,
>> > >> >>
>> > >> >> I was reading this pull request [1] and looked into how I might be able
>> > >> >> to implement this in policy but there seem to be some technical
>> > >> >> difficulties.
>> > >> >>
>> > >> >> * I already use getfscon to seperate the systemd user.slice because the
>> > >> >>   system manager delegates the user.slice to the user manager.
>> > >> >>
>> > >> >>   (genfscon "cgroup2" "/user.slice" cgroupfile_context)
>> > >> >>
>> > >> >>   In the past the proved to be a racy where systemd attempts to
>> > >> >>   write before the object has the context associated with the genfscon.
>> > >> >
>> > >> > I don't understand how this could be racy - genfscon-assigned contexts
>> > >> > should be assigned when the dentry is first instantiated via
>> > >> > inode_doinit_with_dentry and therefore the inode shouldn't be
>> > >> > accessible to userspace prior to this initial assignment AFAIK.
>> > >> > Possibly I am missing something.
>> > >>
>> > >> I recall encountering this sporadically, but I admit that it has been a
>> > >> while since I supressed it in policy. I might try to reproduce. AFAIK my
>> > >> policy is the only policy that actually labels some trees on cgroup2 fs
>> > >> with private types currently.
>> > >>
>> > >> >
>> > >> >>   I decided to dontaudit attempts to write to the mislabeled object and
>> > >> >>   it *seems* as if systemd retries until it can write it i.e. when the
>> > >> >>   object carries the expected label and so that seems to work eventually
>> > >> >>   but it looks fragile.
>> > >> >>
>> > >> >> * The challenge with memory pressure implementation [2] is that these
>> > >> >>   "memory.pressure" files end up in random locations under
>> > >> >>   "/system.slice" for example:
>> > >> >>
>> > >> >>   /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure
>> > >> >>
>> > >> >>   Where in the above systemd-journald.service might be
>> > >> >>   templated (systemd-journald@FOO.service). Point is that the path is
>> > >> >>   random. genfscon does not support regex and glob. I can't do for example:
>> > >> >>
>> > >> >>   (genfscon "cgroup2" "/system.slice/.*/memory.pressure"
>> > >> >>   cgroupfile_context)
>> > >> >>
>> > >> >>   Fortunately cgroup2fs supports relabeling but if systemd has to
>> > >> >>   manually relabel the cgroup files then I would imagine that this is
>> > >> >>   racy as well, and that does not really solve the underlying issue.
>> > >> >>
>> > >> >>   I am looking for ideas and suggestions
>> > >> >
>> > >> > Optimally one of two things would happen:
>> > >> > 1. The kernel would label the inode correctly when it is first created
>> > >> > (e.g. by augmenting genfscon to support more general matching), or
>> > >> > 2. The userspace component that creates these files would label them
>> > >> > correctly at creation (via setfscreatecon() prior to creation).
>> > >>
>> > >> Agree but 1. would require regex/glob support for genfscon and 2. these
>> > >> files aren't "created" by userspace AFAIK and so setfscreatecon or
>> > >> automatic object type transitions are probably not an option here.
>> > >>
>> > >> >
>> > >> > Pardon my ignorance but what creates these files initially? The kernel
>> > >> > in response to some event or systemd or some other userspace
>> > >> > component?
>> > >>
>> > >> Yes AFAIK it is the former (psuedo filesystem similar to procfs, debugfs
>> > >> in that sense). This is also why I don't think that the PR mentioned is
>> > >> tested because cgroup2 fs labeling is done with genfscon and not fsuse
>> > >> trans or fsuse xattr so even if the files would be created by
>> > >> userspace (which I think is not the case) the specified automatic object
>> > >> type transition rule wouldnt work.
>> > >
>> > > Actually, type transitions on cgroupfs should work - I added special
>> > > hooks for kernfs just for that some time ago - see kernel commits
>> > > d0c9c153b4bd6963c8fcccbc0caa12e8fa8d971d..e19dfdc83b60f196e0653d683499f7bc5548128f.
>> >
>> > Interesting. I will try this out. Would this not require at least a
>> > "fsuse trans" statement in policy?
>>
>> No, it should work alongside genfscon. cgroupfs already was special
>> before that as it allowed relabeling despite genfscon being used.
>>
>> >
>> > https://github.com/SELinuxProject/refpolicy/blob/master/policy/modules/kernel/filesystem.te#L89
>> >
>> > Also I am not sure if that support would make much sense on a filesystem
>> > where files are created by the kernel in reaction to some event.
>>
>> It does make sense with named transitions, plus it was needed to make
>> even a simple parent-child inheritance work. Also, I believe some
>> cgroupfs files/directories (I think only directories?) can be created
>> by userspace, too.
>
> We should likely check that the SELinux Notebook and/or other
> documentation reflects this support and which filesystem types are
> supported, both wrt the filesystem types that support both genfscon +
> setxattr and those that support genfscon+setxattr+type_transition
> rules.

I tried this out:

1. Yes, you can create dirs on cgroup2 fs (but not files).
2. You can have a genfscon "cgroup2" alongside fsuse trans "cgroup2", but
if you do then any genfscon statements you might have, for example
(genfscon "cgroup2" "/user.slice" cgroupfile_context), no longer
work, i.e. it's pointless to have them both.
3. Even with a fsuse trans statement I could not make type transitions
work for directories created on cgroup2 fs.

Even if you could create directories on a cgroupfs with a type
transition, and if the files under that directory inherited the
type of the parent, that still would not be good enough to address
the memory.pressure file challenge, because the point is to allow a
service to write the memory.pressure file but not other files in that
same directory.
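
As a rough illustration of the goal described above, the desired end
state might look like the CIL fragment below. All type names in it
(cgroup_t, cgroup_memory_pressure_t, systemd_t, journald_t) are
hypothetical placeholders, the types other than the new one are assumed
to be declared elsewhere, and whether the kernel actually applies the
named transition to kernel-populated files on cgroup2 is exactly the
open question in this thread.

```cil
; Hypothetical private type for memory.pressure files only.
(type cgroup_memory_pressure_t)

; Named type transition: when an object of class file named
; "memory.pressure" is created by systemd_t in a directory labeled
; cgroup_t, label it cgroup_memory_pressure_t instead of cgroup_t.
(typetransition systemd_t cgroup_t file "memory.pressure" cgroup_memory_pressure_t)

; The service may write its memory.pressure file ...
(allow journald_t cgroup_memory_pressure_t (file (getattr open read write)))
; ... but only read the sibling files, which keep the directory's type.
(allow journald_t cgroup_t (file (getattr open read)))
```

Because the sibling files keep the directory's type, plain
parent-to-child inheritance cannot express this split; it needs a
per-name rule of some kind.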

-- 
gpg --locate-keys dominick.grift@defensec.nl
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
Dominick Grift

* Re: cgroup2 labeling question
  2023-03-20 15:23             ` Dominick Grift
@ 2023-03-20 16:32               ` Stephen Smalley
  2023-03-20 16:37                 ` Dominick Grift
  0 siblings, 1 reply; 29+ messages in thread
From: Stephen Smalley @ 2023-03-20 16:32 UTC (permalink / raw)
  To: Dominick Grift; +Cc: Ondrej Mosnacek, Paul Moore, selinux

On Mon, Mar 20, 2023 at 11:23 AM Dominick Grift
<dominick.grift@defensec.nl> wrote:
>
> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
>
> > On Mon, Mar 20, 2023 at 10:46 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> >>
> >> On Mon, Mar 20, 2023 at 3:19 PM Dominick Grift
> >> <dominick.grift@defensec.nl> wrote:
> >> >
> >> > Ondrej Mosnacek <omosnace@redhat.com> writes:
> >> >
> >> > > On Mon, Mar 20, 2023 at 2:59 PM Dominick Grift
> >> > > <dominick.grift@defensec.nl> wrote:
> >> > >>
> >> > >> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
> >> > >>
> >> > >> > On Mon, Mar 20, 2023 at 3:25 AM Dominick Grift
> >> > >> > <dominick.grift@defensec.nl> wrote:
> >> > >> >>
> >> > >> >>
> >> > >> >> Hi,
> >> > >> >>
> >> > >> >> I was reading this pull request [1] and looked into how I might be able
> >> > >> >> to implement this in policy but there seem to be some technical
> >> > >> >> difficulties.
> >> > >> >>
> >> > >> >> * I already use getfscon to seperate the systemd user.slice because the
> >> > >> >>   system manager delegates the user.slice to the user manager.
> >> > >> >>
> >> > >> >>   (genfscon "cgroup2" "/user.slice" cgroupfile_context)
> >> > >> >>
> >> > >> >>   In the past the proved to be a racy where systemd attempts to
> >> > >> >>   write before the object has the context associated with the genfscon.
> >> > >> >
> >> > >> > I don't understand how this could be racy - genfscon-assigned contexts
> >> > >> > should be assigned when the dentry is first instantiated via
> >> > >> > inode_doinit_with_dentry and therefore the inode shouldn't be
> >> > >> > accessible to userspace prior to this initial assignment AFAIK.
> >> > >> > Possibly I am missing something.
> >> > >>
> >> > >> I recall encountering this sporadically, but I admit that it has been a
> >> > >> while since I supressed it in policy. I might try to reproduce. AFAIK my
> >> > >> policy is the only policy that actually labels some trees on cgroup2 fs
> >> > >> with private types currently.
> >> > >>
> >> > >> >
> >> > >> >>   I decided to dontaudit attempts to write to the mislabeled object and
> >> > >> >>   it *seems* as if systemd retries until it can write it i.e. when the
> >> > >> >>   object carries the expected label and so that seems to work eventually
> >> > >> >>   but it looks fragile.
> >> > >> >>
> >> > >> >> * The challenge with memory pressure implementation [2] is that these
> >> > >> >>   "memory.pressure" files end up in random locations under
> >> > >> >>   "/system.slice" for example:
> >> > >> >>
> >> > >> >>   /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure
> >> > >> >>
> >> > >> >>   Where in the above systemd-journald.service might be
> >> > >> >>   templated (systemd-journald@FOO.service). Point is that the path is
> >> > >> >>   random. genfscon does not support regex and glob. I can't do for example:
> >> > >> >>
> >> > >> >>   (genfscon "cgroup2" "/system.slice/.*/memory.pressure"
> >> > >> >>   cgroupfile_context)
> >> > >> >>
> >> > >> >>   Fortunately cgroup2fs supports relabeling but if systemd has to
> >> > >> >>   manually relabel the cgroup files then I would imagine that this is
> >> > >> >>   racy as well, and that does not really solve the underlying issue.
> >> > >> >>
> >> > >> >>   I am looking for ideas and suggestions
> >> > >> >
> >> > >> > Optimally one of two things would happen:
> >> > >> > 1. The kernel would label the inode correctly when it is first created
> >> > >> > (e.g. by augmenting genfscon to support more general matching), or
> >> > >> > 2. The userspace component that creates these files would label them
> >> > >> > correctly at creation (via setfscreatecon() prior to creation).
> >> > >>
> >> > >> Agree but 1. would require regex/glob support for genfscon and 2. these
> >> > >> files aren't "created" by userspace AFAIK and so setfscreatecon or
> >> > >> automatic object type transitions are probably not an option here.
> >> > >>
> >> > >> >
> >> > >> > Pardon my ignorance but what creates these files initially? The kernel
> >> > >> > in response to some event or systemd or some other userspace
> >> > >> > component?
> >> > >>
> >> > >> Yes AFAIK it is the former (psuedo filesystem similar to procfs, debugfs
> >> > >> in that sense). This is also why I don't think that the PR mentioned is
> >> > >> tested because cgroup2 fs labeling is done with genfscon and not fsuse
> >> > >> trans or fsuse xattr so even if the files would be created by
> >> > >> userspace (which I think is not the case) the specified automatic object
> >> > >> type transition rule wouldnt work.
> >> > >
> >> > > Actually, type transitions on cgroupfs should work - I added special
> >> > > hooks for kernfs just for that some time ago - see kernel commits
> >> > > d0c9c153b4bd6963c8fcccbc0caa12e8fa8d971d..e19dfdc83b60f196e0653d683499f7bc5548128f.
> >> >
> >> > Interesting. I will try this out. Would this not require at least a
> >> > "fsuse trans" statement in policy?
> >>
> >> No, it should work alongside genfscon. cgroupfs already was special
> >> before that as it allowed relabeling despite genfscon being used.
> >>
> >> >
> >> > https://github.com/SELinuxProject/refpolicy/blob/master/policy/modules/kernel/filesystem.te#L89
> >> >
> >> > Also I am not sure if that support would make much sense on a filesystem
> >> > where files are created by the kernel in reaction to some event.
> >>
> >> It does make sense with named transitions, plus it was needed to make
> >> even a simple parent-child inheritance work. Also, I believe some
> >> cgroupfs files/directories (I think only directories?) can be created
> >> by userspace, too.
> >
> > We should likely check that the SELinux Notebook and/or other
> > documentation reflects this support and which filesystem types are
> > supported, both wrt the filesystem types that support both genfscon +
> > setxattr and those that support genfscon+setxattr+type_transition
> > rules.
>
> I tried this out:
>
> 1. Yes, you can create dirs on cgroup2 fs (but not files).
> 2. You can have a genfscon "cgroup2" alongside fsuse trans "cgroup2", but
> if you do then any genfscon statements you might have, for example
> (genfscon "cgroup2" "/user.slice" cgroupfile_context), no longer
> work, i.e. it's pointless to have them both.
> 3. Even with a fsuse trans statement I could not make type transitions
> work for directories created on cgroup2 fs.
>
> Even if you could create directories on a cgroupfs with a type
> transition, and if the files under that directory inherited the
> type of the parent, that still would not be good enough to address
> the memory.pressure file challenge, because the point is to allow a
> service to write the memory.pressure file but not other files in that
> same directory.

You don't want a fs_use_trans statement in your policy for cgroup2.
Just genfscon statements. The kernel will still check for
type_transition rules and apply them to files at creation time without
having a fs_use_trans, but having a fs_use_trans will override
genfscon.
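
A minimal sketch of that arrangement in CIL, assuming hypothetical
context names (cgroup_context, cgroup_user_slice_context) that are
defined elsewhere in the policy:

```cil
; Label cgroup2 with genfscon only. Paths are relative to the root of
; the filesystem and are matched as literal prefixes (no regex or glob).
(genfscon "cgroup2" "/" cgroup_context)
(genfscon "cgroup2" "/user.slice" cgroup_user_slice_context)

; Deliberately absent: a statement such as
;   (fsuse trans "cgroup2" cgroup_context)
; Per the explanation above, it would override the genfscon entries.
```

Any typetransition rules are written as usual and, per the above, need
no fsuse statement to accompany them.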

* Re: cgroup2 labeling question
  2023-03-20 16:32               ` Stephen Smalley
@ 2023-03-20 16:37                 ` Dominick Grift
  2023-03-20 17:28                   ` Stephen Smalley
  0 siblings, 1 reply; 29+ messages in thread
From: Dominick Grift @ 2023-03-20 16:37 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: Ondrej Mosnacek, Paul Moore, selinux

Stephen Smalley <stephen.smalley.work@gmail.com> writes:

> On Mon, Mar 20, 2023 at 11:23 AM Dominick Grift
> <dominick.grift@defensec.nl> wrote:
>>
>> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
>>
>> > On Mon, Mar 20, 2023 at 10:46 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>> >>
>> >> On Mon, Mar 20, 2023 at 3:19 PM Dominick Grift
>> >> <dominick.grift@defensec.nl> wrote:
>> >> >
>> >> > Ondrej Mosnacek <omosnace@redhat.com> writes:
>> >> >
>> >> > > On Mon, Mar 20, 2023 at 2:59 PM Dominick Grift
>> >> > > <dominick.grift@defensec.nl> wrote:
>> >> > >>
>> >> > >> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
>> >> > >>
>> >> > >> > On Mon, Mar 20, 2023 at 3:25 AM Dominick Grift
>> >> > >> > <dominick.grift@defensec.nl> wrote:
>> >> > >> >>
>> >> > >> >>
>> >> > >> >> Hi,
>> >> > >> >>
>> >> > >> >> I was reading this pull request [1] and looked into how I might be able
>> >> > >> >> to implement this in policy but there seem to be some technical
>> >> > >> >> difficulties.
>> >> > >> >>
>> >> > >> >> * I already use getfscon to seperate the systemd user.slice because the
>> >> > >> >>   system manager delegates the user.slice to the user manager.
>> >> > >> >>
>> >> > >> >>   (genfscon "cgroup2" "/user.slice" cgroupfile_context)
>> >> > >> >>
>> >> > >> >>   In the past the proved to be a racy where systemd attempts to
>> >> > >> >>   write before the object has the context associated with the genfscon.
>> >> > >> >
>> >> > >> > I don't understand how this could be racy - genfscon-assigned contexts
>> >> > >> > should be assigned when the dentry is first instantiated via
>> >> > >> > inode_doinit_with_dentry and therefore the inode shouldn't be
>> >> > >> > accessible to userspace prior to this initial assignment AFAIK.
>> >> > >> > Possibly I am missing something.
>> >> > >>
>> >> > >> I recall encountering this sporadically, but I admit that it has been a
>> >> > >> while since I supressed it in policy. I might try to reproduce. AFAIK my
>> >> > >> policy is the only policy that actually labels some trees on cgroup2 fs
>> >> > >> with private types currently.
>> >> > >>
>> >> > >> >
>> >> > >> >>   I decided to dontaudit attempts to write to the mislabeled object and
>> >> > >> >>   it *seems* as if systemd retries until it can write it i.e. when the
>> >> > >> >>   object carries the expected label and so that seems to work eventually
>> >> > >> >>   but it looks fragile.
>> >> > >> >>
>> >> > >> >> * The challenge with memory pressure implementation [2] is that these
>> >> > >> >>   "memory.pressure" files end up in random locations under
>> >> > >> >>   "/system.slice" for example:
>> >> > >> >>
>> >> > >> >>   /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure
>> >> > >> >>
>> >> > >> >>   Where in the above systemd-journald.service might be
>> >> > >> >>   templated (systemd-journald@FOO.service). Point is that the path is
>> >> > >> >>   random. genfscon does not support regex and glob. I can't do for example:
>> >> > >> >>
>> >> > >> >>   (genfscon "cgroup2" "/system.slice/.*/memory.pressure"
>> >> > >> >>   cgroupfile_context)
>> >> > >> >>
>> >> > >> >>   Fortunately cgroup2fs supports relabeling but if systemd has to
>> >> > >> >>   manually relabel the cgroup files then I would imagine that this is
>> >> > >> >>   racy as well, and that does not really solve the underlying issue.
>> >> > >> >>
>> >> > >> >>   I am looking for ideas and suggestions
>> >> > >> >
>> >> > >> > Optimally one of two things would happen:
>> >> > >> > 1. The kernel would label the inode correctly when it is first created
>> >> > >> > (e.g. by augmenting genfscon to support more general matching), or
>> >> > >> > 2. The userspace component that creates these files would label them
>> >> > >> > correctly at creation (via setfscreatecon() prior to creation).
>> >> > >>
>> >> > >> Agree but 1. would require regex/glob support for genfscon and 2. these
>> >> > >> files aren't "created" by userspace AFAIK and so setfscreatecon or
>> >> > >> automatic object type transitions are probably not an option here.
>> >> > >>
>> >> > >> >
>> >> > >> > Pardon my ignorance but what creates these files initially? The kernel
>> >> > >> > in response to some event or systemd or some other userspace
>> >> > >> > component?
>> >> > >>
>> >> > >> Yes AFAIK it is the former (psuedo filesystem similar to procfs, debugfs
>> >> > >> in that sense). This is also why I don't think that the PR mentioned is
>> >> > >> tested because cgroup2 fs labeling is done with genfscon and not fsuse
>> >> > >> trans or fsuse xattr so even if the files would be created by
>> >> > >> userspace (which I think is not the case) the specified automatic object
>> >> > >> type transition rule wouldnt work.
>> >> > >
>> >> > > Actually, type transitions on cgroupfs should work - I added special
>> >> > > hooks for kernfs just for that some time ago - see kernel commits
>> >> > > d0c9c153b4bd6963c8fcccbc0caa12e8fa8d971d..e19dfdc83b60f196e0653d683499f7bc5548128f.
>> >> >
>> >> > Interesting. I will try this out. Would this not require at least a
>> >> > "fsuse trans" statement in policy?
>> >>
>> >> No, it should work alongside genfscon. cgroupfs already was special
>> >> before that as it allowed relabeling despite genfscon being used.
>> >>
>> >> >
>> >> > https://github.com/SELinuxProject/refpolicy/blob/master/policy/modules/kernel/filesystem.te#L89
>> >> >
>> >> > Also I am not sure if that support would make much sense on a filesystem
>> >> > where files are created by the kernel in reaction to some event.
>> >>
>> >> It does make sense with named transitions, plus it was needed to make
>> >> even a simple parent-child inheritance work. Also, I believe some
>> >> cgroupfs files/directories (I think only directories?) can be created
>> >> by userspace, too.
>> >
>> > We should likely check that the SELinux Notebook and/or other
>> > documentation reflects this support and which filesystem types are
>> > supported, both wrt the filesystem types that support both genfscon +
>> > setxattr and those that support genfscon+setxattr+type_transition
>> > rules.
>>
>> I tried this out:
>>
>> 1. Yes, you can create dirs on cgroup2 fs (but not files).
>> 2. You can have a genfscon "cgroup2" alongside fsuse trans "cgroup2", but
>> if you do then any genfscon statements you might have, for example
>> (genfscon "cgroup2" "/user.slice" cgroupfile_context), no longer
>> work, i.e. it's pointless to have them both.
>> 3. Even with a fsuse trans statement I could not make type transitions
>> work for directories created on cgroup2 fs.
>>
>> Even if you could create directories on a cgroupfs with a type
>> transition, and if the files under that directory inherited the
>> type of the parent, that still would not be good enough to address
>> the memory.pressure file challenge, because the point is to allow a
>> service to write the memory.pressure file but not other files in that
>> same directory.
>
> You don't want a fs_use_trans statement in your policy for cgroup2.
> Just genfscon statements. The kernel will still check for
> type_transition rules and apply them to files at creation time without
> having a fs_use_trans, but having a fs_use_trans will override
> genfscon.

Thanks for the clarification. To reiterate, I could not get
type_transition to work with only genfscon either when creating a
directory on cgroup2 fs, but maybe I was overlooking something.

-- 
gpg --locate-keys dominick.grift@defensec.nl
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
Dominick Grift

* Re: cgroup2 labeling question
  2023-03-20 16:37                 ` Dominick Grift
@ 2023-03-20 17:28                   ` Stephen Smalley
  2023-03-20 17:53                     ` Stephen Smalley
  0 siblings, 1 reply; 29+ messages in thread
From: Stephen Smalley @ 2023-03-20 17:28 UTC (permalink / raw)
  To: Dominick Grift; +Cc: Ondrej Mosnacek, Paul Moore, selinux

On Mon, Mar 20, 2023 at 12:37 PM Dominick Grift
<dominick.grift@defensec.nl> wrote:
>
> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
>
> > On Mon, Mar 20, 2023 at 11:23 AM Dominick Grift
> > <dominick.grift@defensec.nl> wrote:
> >>
> >> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
> >>
> >> > On Mon, Mar 20, 2023 at 10:46 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> >> >>
> >> >> On Mon, Mar 20, 2023 at 3:19 PM Dominick Grift
> >> >> <dominick.grift@defensec.nl> wrote:
> >> >> >
> >> >> > Ondrej Mosnacek <omosnace@redhat.com> writes:
> >> >> >
> >> >> > > On Mon, Mar 20, 2023 at 2:59 PM Dominick Grift
> >> >> > > <dominick.grift@defensec.nl> wrote:
> >> >> > >>
> >> >> > >> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
> >> >> > >>
> >> >> > >> > On Mon, Mar 20, 2023 at 3:25 AM Dominick Grift
> >> >> > >> > <dominick.grift@defensec.nl> wrote:
> >> >> > >> >>
> >> >> > >> >>
> >> >> > >> >> Hi,
> >> >> > >> >>
> >> >> > >> >> I was reading this pull request [1] and looked into how I might be able
> >> >> > >> >> to implement this in policy but there seem to be some technical
> >> >> > >> >> difficulties.
> >> >> > >> >>
> >> >> > >> >> * I already use getfscon to seperate the systemd user.slice because the
> >> >> > >> >>   system manager delegates the user.slice to the user manager.
> >> >> > >> >>
> >> >> > >> >>   (genfscon "cgroup2" "/user.slice" cgroupfile_context)
> >> >> > >> >>
> >> >> > >> >>   In the past the proved to be a racy where systemd attempts to
> >> >> > >> >>   write before the object has the context associated with the genfscon.
> >> >> > >> >
> >> >> > >> > I don't understand how this could be racy - genfscon-assigned contexts
> >> >> > >> > should be assigned when the dentry is first instantiated via
> >> >> > >> > inode_doinit_with_dentry and therefore the inode shouldn't be
> >> >> > >> > accessible to userspace prior to this initial assignment AFAIK.
> >> >> > >> > Possibly I am missing something.
> >> >> > >>
> >> >> > >> I recall encountering this sporadically, but I admit that it has been a
> >> >> > >> while since I supressed it in policy. I might try to reproduce. AFAIK my
> >> >> > >> policy is the only policy that actually labels some trees on cgroup2 fs
> >> >> > >> with private types currently.
> >> >> > >>
> >> >> > >> >
> >> >> > >> >>   I decided to dontaudit attempts to write to the mislabeled object and
> >> >> > >> >>   it *seems* as if systemd retries until it can write it i.e. when the
> >> >> > >> >>   object carries the expected label and so that seems to work eventually
> >> >> > >> >>   but it looks fragile.
> >> >> > >> >>
> >> >> > >> >> * The challenge with memory pressure implementation [2] is that these
> >> >> > >> >>   "memory.pressure" files end up in random locations under
> >> >> > >> >>   "/system.slice" for example:
> >> >> > >> >>
> >> >> > >> >>   /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure
> >> >> > >> >>
> >> >> > >> >>   Where in the above systemd-journald.service might be
> >> >> > >> >>   templated (systemd-journald@FOO.service). Point is that the path is
> >> >> > >> >>   random. genfscon does not support regex and glob. I can't do for example:
> >> >> > >> >>
> >> >> > >> >>   (genfscon "cgroup2" "/system.slice/.*/memory.pressure"
> >> >> > >> >>   cgroupfile_context)
> >> >> > >> >>
> >> >> > >> >>   Fortunately cgroup2fs supports relabeling but if systemd has to
> >> >> > >> >>   manually relabel the cgroup files then I would imagine that this is
> >> >> > >> >>   racy as well, and that does not really solve the underlying issue.
> >> >> > >> >>
> >> >> > >> >>   I am looking for ideas and suggestions
> >> >> > >> >
> >> >> > >> > Optimally one of two things would happen:
> >> >> > >> > 1. The kernel would label the inode correctly when it is first created
> >> >> > >> > (e.g. by augmenting genfscon to support more general matching), or
> >> >> > >> > 2. The userspace component that creates these files would label them
> >> >> > >> > correctly at creation (via setfscreatecon() prior to creation).
> >> >> > >>
> >> >> > >> Agree but 1. would require regex/glob support for genfscon and 2. these
> >> >> > >> files aren't "created" by userspace AFAIK and so setfscreatecon or
> >> >> > >> automatic object type transitions are probably not an option here.
> >> >> > >>
> >> >> > >> >
> >> >> > >> > Pardon my ignorance but what creates these files initially? The kernel
> >> >> > >> > in response to some event or systemd or some other userspace
> >> >> > >> > component?
> >> >> > >>
> >> >> > >> Yes AFAIK it is the former (psuedo filesystem similar to procfs, debugfs
> >> >> > >> in that sense). This is also why I don't think that the PR mentioned is
> >> >> > >> tested because cgroup2 fs labeling is done with genfscon and not fsuse
> >> >> > >> trans or fsuse xattr so even if the files would be created by
> >> >> > >> userspace (which I think is not the case) the specified automatic object
> >> >> > >> type transition rule wouldnt work.
> >> >> > >
> >> >> > > Actually, type transitions on cgroupfs should work - I added special
> >> >> > > hooks for kernfs just for that some time ago - see kernel commits
> >> >> > > d0c9c153b4bd6963c8fcccbc0caa12e8fa8d971d..e19dfdc83b60f196e0653d683499f7bc5548128f.
> >> >> >
> >> >> > Interesting. I will try this out. Would this not require at least a
> >> >> > "fsuse trans" statement in policy?
> >> >>
> >> >> No, it should work alongside genfscon. cgroupfs already was special
> >> >> before that as it allowed relabeling despite genfscon being used.
> >> >>
> >> >> >
> >> >> > https://github.com/SELinuxProject/refpolicy/blob/master/policy/modules/kernel/filesystem.te#L89
> >> >> >
> >> >> > Also I am not sure if that support would make much sense on a filesystem
> >> >> > where files are created my the kernel in reaction to some event.
> >> >>
> >> >> It does make sense with named transitions, plus it was needed to make
> >> >> even a simple parent-child inheritance work. Also, I believe some
> >> >> cgroupfs files/directories (I think only directories?) can be created
> >> >> by userspace, too.
> >> >
> >> > We should likely check that the SELinux Notebook and/or other
> >> > documentation reflects this support and which filesystem types are
> >> > supported, both wrt the filesystem types that support both genfscon +
> >> > setxattr and those that support genfscon+setxattr+type_transition
> >> > rules.
> >>
> >> I tried this out:
> >>
> >> 1. yes you can create dirs on cgroup2 fs (but not files)
> >> 2. you can have a genfscon "cgroup2" alongside fsuse trans "cgroup2" but
> >> if you do then any genfscon statements you might have like for example
> >> genfscon "cgroup2" "/user.slice" cgroupfile_context) no longer
> >> work. i.e. its pointless to have then both
> >> 3. even with a fsuse trans statement I could not make type transitions
> >> work for directories created on cgroup2 fs.
> >>
> >> Even if you could create directories on a cgroupfs with a type
> >> transition, and if the files under that directory would inherited the
> >> type of the parent, then that still would not be good enough to address
> >> the memory.pressure file challenge because the point is to allow a
> >> service to write the memory.pressure file but not other files in that
> >> same directory.
> >
> > You don't want a fs_use_trans statement in your policy for cgroup2.
> > Just genfscon statements. The kernel will still check for
> > type_transition rules and apply them to files at creation time without
> > having a fs_use_trans, but having a fs_use_trans will override
> > genfscon.
>
> Thanks for clarification. to reiterate I could not get type_transition
> to work with only genfscon either when creating a directory on cgroup2
> fs but maybe I was overlooking something.

Hmm...that's interesting. I just tried in Fedora using one of the
type_transitions already defined in the default policy and although it
appears to use the type_transition to compute the new SID for the
create check, ls -Z of the file after creation showed it labeled
cgroup_t instead. So it doesn't appear to be working or I am doing it
wrong.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: cgroup2 labeling question
  2023-03-20 17:28                   ` Stephen Smalley
@ 2023-03-20 17:53                     ` Stephen Smalley
  2023-03-20 18:07                       ` Dominick Grift
  2023-03-20 18:15                       ` Stephen Smalley
  0 siblings, 2 replies; 29+ messages in thread
From: Stephen Smalley @ 2023-03-20 17:53 UTC (permalink / raw)
  To: Dominick Grift; +Cc: Ondrej Mosnacek, Paul Moore, selinux

On Mon, Mar 20, 2023 at 1:28 PM Stephen Smalley
<stephen.smalley.work@gmail.com> wrote:
> Hmm...that's interesting. I just tried in Fedora using one of the
> type_transitions already defined in the default policy and although it
> appears to use the type_transition to compute the new SID for the
> create check, ls -Z of the file after creation showed it labeled
> cgroup_t instead. So it doesn't appear to be working or I am doing it
> wrong.

Reproducer, on F34,
$ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
mkdir: cannot create directory ‘/sys/fs/cgroup/system.slice/.snapshots’: Permission denied
$ sudo ausearch -m AVC -ts recent -i
----
type=AVC msg=audit(03/20/2023 13:00:04.699:47156) : avc:  denied  { associate } for  pid=152325 comm=mkdir name=.snapshots scontext=unconfined_u:object_r:snapperd_data_t:s0 tcontext=system_u:object_r:cgroup_t:s0 tclass=filesystem permissive=0
$ seinfo --fs_use | grep cgroup
$ seinfo --genfscon | grep cgroup
   genfscon cgroup /  system_u:object_r:cgroup_t:s0
   genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
$ sesearch -T -s unconfined_t -t cgroup_t -c dir
type_transition unconfined_t cgroup_t:dir snapperd_data_t .snapshots
$ sudo setenforce 0
$ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
$ ls -Zd /sys/fs/cgroup/system.slice/.snapshots
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/.snapshots
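
For anyone reading the denial above: the scontext is already
snapperd_data_t, i.e. the type named in the type_transition, and the
denied permission is associate on tclass=filesystem. A small sketch for
pulling those fields out of AVC records follows. The helper names
avc_field and avc_perms are made up for illustration, and the only
assumption is the key=value layout that ausearch -i prints:

```shell
#!/bin/sh
# Hypothetical helpers (not part of any SELinux tool): extract fields
# from AVC records read on stdin. Records wrapped across lines by a
# mail client are tolerated because fields are split on whitespace.

# Print the value of one key=value field, e.g. scontext or tclass.
avc_field() {
    tr ' ' '\n' | sed -n "s/^$1=//p"
}

# Print the permission list found between "{" and "}".
avc_perms() {
    # Join wrapped lines and squeeze runs of spaces before matching.
    tr -s '\n' ' ' | sed -n 's/.*{ \(.*\) }.*/\1/p'
}
```

Usage would be along the lines of
"sudo ausearch -m AVC -ts recent -i | avc_field scontext".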

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: cgroup2 labeling question
  2023-03-20 17:53                     ` Stephen Smalley
@ 2023-03-20 18:07                       ` Dominick Grift
  2023-03-20 18:22                         ` Christian Göttsche
  2023-03-20 18:15                       ` Stephen Smalley
  1 sibling, 1 reply; 29+ messages in thread
From: Dominick Grift @ 2023-03-20 18:07 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: Ondrej Mosnacek, Paul Moore, selinux

Stephen Smalley <stephen.smalley.work@gmail.com> writes:

> On Mon, Mar 20, 2023 at 1:28 PM Stephen Smalley
> <stephen.smalley.work@gmail.com> wrote:
>> Hmm...that's interesting. I just tried in Fedora using one of the
>> type_transitions already defined in the default policy and although it
>> appears to use the type_transition to compute the new SID for the
>> create check, ls -Z of the file after creation showed it labeled
>> cgroup_t instead. So it doesn't appear to be working or I am doing it
>> wrong.

I am totally confused now as well, because Christian on IRC says it
works for him, but I cannot get it to work here and I have tried various
combinations.

>
> Reproducer, on F34,
> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
> mkdir: cannot create directory
> ‘/sys/fs/cgroup/system.slice/.snapshots’: Permission denied
> $ sudo ausearch -m AVC -ts recent -i
> ----
> type=AVC msg=audit(03/20/2023 13:00:04.699:47156) : avc:  denied  {
> associate } for  pid=152325 comm=mkdir name=.snapshots
> scontext=unconfined_u:object_r:snapperd_data_t:s0
> tcontext=system_u:object_r:cgroup_t:s0 tclass=filesystem permissive=0
> $ seinfo --fs_use | grep cgroup
> $ seinfo --genfscon | grep cgroup
>    genfscon cgroup /  system_u:object_r:cgroup_t:s0
>    genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
> $ sesearch -T -s unconfined_t -t cgroup_t -c dir
> type_transition unconfined_t cgroup_t:dir snapperd_data_t .snapshots
> $ sudo setenforce 0
> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
> $ ls -Zd /sys/fs/cgroup/system.slice/.snapshots
> system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/.snapshots

-- 
gpg --locate-keys dominick.grift@defensec.nl
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
Dominick Grift

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: cgroup2 labeling question
  2023-03-20 17:53                     ` Stephen Smalley
  2023-03-20 18:07                       ` Dominick Grift
@ 2023-03-20 18:15                       ` Stephen Smalley
  2023-03-20 18:19                         ` Dominick Grift
  1 sibling, 1 reply; 29+ messages in thread
From: Stephen Smalley @ 2023-03-20 18:15 UTC (permalink / raw)
  To: Dominick Grift; +Cc: Ondrej Mosnacek, Paul Moore, selinux

On Mon, Mar 20, 2023 at 1:53 PM Stephen Smalley
<stephen.smalley.work@gmail.com> wrote:
>
> On Mon, Mar 20, 2023 at 1:28 PM Stephen Smalley
> <stephen.smalley.work@gmail.com> wrote:
> > Hmm...that's interesting. I just tried in Fedora using one of the
> > type_transitions already defined in the default policy and although it
> > appears to use the type_transition to compute the new SID for the
> > create check, ls -Z of the file after creation showed it labeled
> > cgroup_t instead. So it doesn't appear to be working or I am doing it
> > wrong.
>
> Reproducer, on F34,
> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
> mkdir: cannot create directory
> ‘/sys/fs/cgroup/system.slice/.snapshots’: Permission denied
> $ sudo ausearch -m AVC -ts recent -i
> ----
> type=AVC msg=audit(03/20/2023 13:00:04.699:47156) : avc:  denied  {
> associate } for  pid=152325 comm=mkdir name=.snapshots
> scontext=unconfined_u:object_r:snapperd_data_t:s0
> tcontext=system_u:object_r:cgroup_t:s0 tclass=filesystem permissive=0
> $ seinfo --fs_use | grep cgroup
> $ seinfo --genfscon | grep cgroup
>    genfscon cgroup /  system_u:object_r:cgroup_t:s0
>    genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
> $ sesearch -T -s unconfined_t -t cgroup_t -c dir
> type_transition unconfined_t cgroup_t:dir snapperd_data_t .snapshots
> $ sudo setenforce 0
> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
> $ ls -Zd /sys/fs/cgroup/system.slice/.snapshots
> system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/.snapshots

Unless systemd is coming along after file creation and relabeling it
to cgroup_t at that time.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: cgroup2 labeling question
  2023-03-20 18:15                       ` Stephen Smalley
@ 2023-03-20 18:19                         ` Dominick Grift
  2023-03-20 18:22                           ` Stephen Smalley
  0 siblings, 1 reply; 29+ messages in thread
From: Dominick Grift @ 2023-03-20 18:19 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: Ondrej Mosnacek, Paul Moore, selinux

Stephen Smalley <stephen.smalley.work@gmail.com> writes:

> On Mon, Mar 20, 2023 at 1:53 PM Stephen Smalley
> <stephen.smalley.work@gmail.com> wrote:
>>
>> On Mon, Mar 20, 2023 at 1:28 PM Stephen Smalley
>> <stephen.smalley.work@gmail.com> wrote:
>> > Hmm...that's interesting. I just tried in Fedora using one of the
>> > type_transitions already defined in the default policy and although it
>> > appears to use the type_transition to compute the new SID for the
>> > create check, ls -Z of the file after creation showed it labeled
>> > cgroup_t instead. So it doesn't appear to be working or I am doing it
>> > wrong.
>>
>> Reproducer, on F34,
>> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
>> mkdir: cannot create directory
>> ‘/sys/fs/cgroup/system.slice/.snapshots’: Permission denied
>> $ sudo ausearch -m AVC -ts recent -i
>> ----
>> type=AVC msg=audit(03/20/2023 13:00:04.699:47156) : avc:  denied  {
>> associate } for  pid=152325 comm=mkdir name=.snapshots
>> scontext=unconfined_u:object_r:snapperd_data_t:s0
>> tcontext=system_u:object_r:cgroup_t:s0 tclass=filesystem permissive=0
>> $ seinfo --fs_use | grep cgroup
>> $ seinfo --genfscon | grep cgroup
>>    genfscon cgroup /  system_u:object_r:cgroup_t:s0
>>    genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
>> $ sesearch -T -s unconfined_t -t cgroup_t -c dir
>> type_transition unconfined_t cgroup_t:dir snapperd_data_t .snapshots
>> $ sudo setenforce 0
>> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
>> $ ls -Zd /sys/fs/cgroup/system.slice/.snapshots
>> system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/.snapshots
>
> Unless systemd is coming along after file creation and relabeling it
> to cgroup_t at that time.

That wouldn't make sense to me, but yes, I considered that as well. I ruled
it out without actually confirming it. I actually added a rule:

auditallow domain cgroup_t:dir create;

and that also does not show grants for all the dirs in /sys/fs/cgroup
(just some)

voodoo

-- 
gpg --locate-keys dominick.grift@defensec.nl
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
Dominick Grift

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: cgroup2 labeling question
  2023-03-20 18:19                         ` Dominick Grift
@ 2023-03-20 18:22                           ` Stephen Smalley
  2023-03-20 18:26                             ` Dominick Grift
  0 siblings, 1 reply; 29+ messages in thread
From: Stephen Smalley @ 2023-03-20 18:22 UTC (permalink / raw)
  To: Dominick Grift; +Cc: Ondrej Mosnacek, Paul Moore, selinux

On Mon, Mar 20, 2023 at 2:19 PM Dominick Grift
<dominick.grift@defensec.nl> wrote:
>
> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
>
> > On Mon, Mar 20, 2023 at 1:53 PM Stephen Smalley
> > <stephen.smalley.work@gmail.com> wrote:
> >>
> >> On Mon, Mar 20, 2023 at 1:28 PM Stephen Smalley
> >> <stephen.smalley.work@gmail.com> wrote:
> >> > Hmm...that's interesting. I just tried in Fedora using one of the
> >> > type_transitions already defined in the default policy and although it
> >> > appears to use the type_transition to compute the new SID for the
> >> > create check, ls -Z of the file after creation showed it labeled
> >> > cgroup_t instead. So it doesn't appear to be working or I am doing it
> >> > wrong.
> >>
> >> Reproducer, on F34,
> >> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
> >> mkdir: cannot create directory
> >> ‘/sys/fs/cgroup/system.slice/.snapshots’: Permission denied
> >> $ sudo ausearch -m AVC -ts recent -i
> >> ----
> >> type=AVC msg=audit(03/20/2023 13:00:04.699:47156) : avc:  denied  {
> >> associate } for  pid=152325 comm=mkdir name=.snapshots
> >> scontext=unconfined_u:object_r:snapperd_data_t:s0
> >> tcontext=system_u:object_r:cgroup_t:s0 tclass=filesystem permissive=0
> >> $ seinfo --fs_use | grep cgroup
> >> $ seinfo --genfscon | grep cgroup
> >>    genfscon cgroup /  system_u:object_r:cgroup_t:s0
> >>    genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
> >> $ sesearch -T -s unconfined_t -t cgroup_t -c dir
> >> type_transition unconfined_t cgroup_t:dir snapperd_data_t .snapshots
> >> $ sudo setenforce 0
> >> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
> >> $ ls -Zd /sys/fs/cgroup/system.slice/.snapshots
> >> system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/.snapshots
> >
> > Unless systemd is coming along after file creation and relabeling it
> > to cgroup_t at that time.
>
> That wouldnt make sense to me, but yes i considered that as well. Ruled
> it out without actually confirming it. I actually added a rule:
>
> auditallow domain cgroup_t:dir create;
>
> and that also does not show grants for all the dirs in /sys/fs/cgroup
> (just some)
>
> voodoo

It wouldn't be create but rather relabelto permission (if systemd is
relabeling the file after the kernel creates it).
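
One way to check this might be to audit granted relabelto operations and
then filter the audit log for them. A rough sketch, assuming an
auditallow rule covering relabelto on cgroup_t dirs has been loaded (no
such rule is shown in this thread) and that records arrive one per line,
as ausearch -i prints them. granted_relabelto is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical filter (not part of any SELinux tool): keep only AVC
# records that were granted, include relabelto in the permission set,
# and target an object whose type is cgroup_t.
granted_relabelto() {
    grep 'granted' \
        | grep -E '\{[^}]*relabelto[^}]*\}' \
        | grep -E 'tcontext=[^ ]*:cgroup_t:'
}
```

Something like "sudo ausearch -m AVC -ts recent -i | granted_relabelto"
would then list the matching records, if there are any.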

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: cgroup2 labeling question
  2023-03-20 18:07                       ` Dominick Grift
@ 2023-03-20 18:22                         ` Christian Göttsche
  2023-03-20 20:23                           ` Stephen Smalley
  0 siblings, 1 reply; 29+ messages in thread
From: Christian Göttsche @ 2023-03-20 18:22 UTC (permalink / raw)
  To: Dominick Grift; +Cc: Stephen Smalley, Ondrej Mosnacek, Paul Moore, selinux

On Mon, 20 Mar 2023 at 19:14, Dominick Grift <dominick.grift@defensec.nl> wrote:
>
> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
>
> > On Mon, Mar 20, 2023 at 1:28 PM Stephen Smalley
> > <stephen.smalley.work@gmail.com> wrote:
> >> Hmm...that's interesting. I just tried in Fedora using one of the
> >> type_transitions already defined in the default policy and although it
> >> appears to use the type_transition to compute the new SID for the
> >> create check, ls -Z of the file after creation showed it labeled
> >> cgroup_t instead. So it doesn't appear to be working or I am doing it
> >> wrong.
>
> I am totally confused now as well because Christian on IRC say's it
> works for him but I cannot get it to work here and I tried various
> combinations
>
> >
> > Reproducer, on F34,
> > $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
> > mkdir: cannot create directory
> > ‘/sys/fs/cgroup/system.slice/.snapshots’: Permission denied
> > $ sudo ausearch -m AVC -ts recent -i
> > ----
> > type=AVC msg=audit(03/20/2023 13:00:04.699:47156) : avc:  denied  {
> > associate } for  pid=152325 comm=mkdir name=.snapshots
> > scontext=unconfined_u:object_r:snapperd_data_t:s0
> > tcontext=system_u:object_r:cgroup_t:s0 tclass=filesystem permissive=0
> > $ seinfo --fs_use | grep cgroup
> > $ seinfo --genfscon | grep cgroup
> >    genfscon cgroup /  system_u:object_r:cgroup_t:s0
> >    genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
> > $ sesearch -T -s unconfined_t -t cgroup_t -c dir
> > type_transition unconfined_t cgroup_t:dir snapperd_data_t .snapshots
> > $ sudo setenforce 0
> > $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
> > $ ls -Zd /sys/fs/cgroup/system.slice/.snapshots
> > system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/.snapshots
>
> --
> gpg --locate-keys dominick.grift@defensec.nl
> Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
> Dominick Grift

Debian sid (Linux debianBullseye 6.1.0-6-amd64 #1 SMP PREEMPT_DYNAMIC
Debian 6.1.15-1 (2023-03-05) x86_64 GNU/Linux):

type cgroup_test_t;
allow cgroup_test_t cgroup_t:filesystem associate;
filetrans_pattern(sysadm_t, cgroup_t, cgroup_test_t, dir, "testdir")
allow sysadm_t cgroup_test_t:dir { create_dir_perms list_dir_perms };
allow sysadm_t cgroup_test_t:file getattr;


$ seinfo --all | grep cgroup
genfscon cgroup /  system_u:object_r:cgroup_t:s0
genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
genfscon proc /cgroups  system_u:object_r:proc_info_t:s0
cgroup_seclabel
cgroup_t
cgroup_test_t
systemd_cgroups_agent_exec_t
systemd_cgroups_agent_runtime_t
systemd_cgroups_agent_t


$ grep cgroup /etc/selinux/debian/contexts/files/file_contexts
/cgroup/.*              <<none>>
/sys/fs/cgroup/.*               <<none>>
/sys/fs/cgroup/[^/]+            -l      system_u:object_r:cgroup_t:s0
/cgroup         -d      system_u:object_r:cgroup_t:s0
/sys/fs/cgroup          -d      system_u:object_r:cgroup_t:s0
/usr/lib/systemd/systemd-cgroups-agent          --      system_u:object_r:systemd_cgroups_agent_exec_t:s0


$ mkdir /sys/fs/cgroup/system.slice/testdir
$ ls -laZ /sys/fs/cgroup/system.slice/testdir/
total 0
drwxr-x---.  2 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 .
drwxr-xr-x. 19 root root system_u:object_r:cgroup_t:s0  0 Mar 20 19:19 ..
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 cgroup.controllers
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 cgroup.events
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 cgroup.freeze
--w-------.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 cgroup.kill
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 cgroup.max.depth
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 cgroup.max.descendants
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 cgroup.pressure
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 cgroup.procs
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 cgroup.stat
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 cgroup.subtree_control
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 cgroup.threads
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 cgroup.type
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 cpu.pressure
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 cpu.stat
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 io.pressure
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.current
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.events
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.events.local
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.high
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.low
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.max
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.min
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.numa_stat
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.oom.group
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.peak
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.pressure
--w-------.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.reclaim
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.stat
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.swap.current
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.swap.events
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.swap.high
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.swap.max
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.zswap.current
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 memory.zswap.max
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 pids.current
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 pids.events
-rw-r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 pids.max
-r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19 pids.peak
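
In the listing above, the new directory and every kernel-created file
inside it carry cgroup_test_t, while the parent stays cgroup_t. To repeat
that check on another system without reading the listing by eye,
something like the following sketch could be used. The helper name
mislabeled is made up, and it assumes one entry per line in the
"ls -laZ" layout (context in column 5, name in the last column, no
whitespace in names):

```shell
#!/bin/sh
# Hypothetical helper (not part of any SELinux tool): read `ls -laZ`
# output on stdin and print the names of entries whose SELinux type
# differs from the expected type given as $1.
mislabeled() {
    awk -v want="$1" '
        # Skip the "total" line (too few columns) and the ".." parent.
        NF >= 10 && $NF != ".." {
            split($5, ctx, ":")          # user:role:type:level
            if (ctx[3] != want) print $NF
        }'
}
```

For example, "ls -laZ /sys/fs/cgroup/system.slice/testdir/ | mislabeled
cgroup_test_t" should print nothing if the transition and the
inheritance both worked.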

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: cgroup2 labeling question
  2023-03-20 18:22                           ` Stephen Smalley
@ 2023-03-20 18:26                             ` Dominick Grift
  0 siblings, 0 replies; 29+ messages in thread
From: Dominick Grift @ 2023-03-20 18:26 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: Ondrej Mosnacek, Paul Moore, selinux

Stephen Smalley <stephen.smalley.work@gmail.com> writes:

> On Mon, Mar 20, 2023 at 2:19 PM Dominick Grift
> <dominick.grift@defensec.nl> wrote:
>>
>> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
>>
>> > On Mon, Mar 20, 2023 at 1:53 PM Stephen Smalley
>> > <stephen.smalley.work@gmail.com> wrote:
>> >>
>> >> On Mon, Mar 20, 2023 at 1:28 PM Stephen Smalley
>> >> <stephen.smalley.work@gmail.com> wrote:
>> >> > Hmm...that's interesting. I just tried in Fedora using one of the
>> >> > type_transitions already defined in the default policy and although it
>> >> > appears to use the type_transition to compute the new SID for the
>> >> > create check, ls -Z of the file after creation showed it labeled
>> >> > cgroup_t instead. So it doesn't appear to be working or I am doing it
>> >> > wrong.
>> >>
>> >> Reproducer, on F34,
>> >> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
>> >> mkdir: cannot create directory
>> >> ‘/sys/fs/cgroup/system.slice/.snapshots’: Permission denied
>> >> $ sudo ausearch -m AVC -ts recent -i
>> >> ----
>> >> type=AVC msg=audit(03/20/2023 13:00:04.699:47156) : avc:  denied  {
>> >> associate } for  pid=152325 comm=mkdir name=.snapshots
>> >> scontext=unconfined_u:object_r:snapperd_data_t:s0
>> >> tcontext=system_u:object_r:cgroup_t:s0 tclass=filesystem permissive=0
>> >> $ seinfo --fs_use | grep cgroup
>> >> $ seinfo --genfscon | grep cgroup
>> >>    genfscon cgroup /  system_u:object_r:cgroup_t:s0
>> >>    genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
>> >> $ sesearch -T -s unconfined_t -t cgroup_t -c dir
>> >> type_transition unconfined_t cgroup_t:dir snapperd_data_t .snapshots
>> >> $ sudo setenforce 0
>> >> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
>> >> $ ls -Zd /sys/fs/cgroup/system.slice/.snapshots
>> >> system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/.snapshots
>> >
>> > Unless systemd is coming along after file creation and relabeling it
>> > to cgroup_t at that time.
>>
>> That wouldnt make sense to me, but yes i considered that as well. Ruled
>> it out without actually confirming it. I actually added a rule:
>>
>> auditallow domain cgroup_t:dir create;
>>
>> and that also does not show grants for all the dirs in /sys/fs/cgroup
>> (just some)
>>
>> voodoo
>
> It wouldn't be create but rather relabelto permission (if systemd is
> relabeling the file after the kernel creates it).

Yes, I know, but I didn't add it to audit relabelto; I added it to audit
the create, since the dirs are created there in the first place (I guess).

Even though I doubt that a relabel resets it - I will try it out just to
confirm. Something does not add up.

-- 
gpg --locate-keys dominick.grift@defensec.nl
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
Dominick Grift

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: cgroup2 labeling question
  2023-03-20 18:22                         ` Christian Göttsche
@ 2023-03-20 20:23                           ` Stephen Smalley
  2023-03-21 13:40                             ` Ondrej Mosnacek
  0 siblings, 1 reply; 29+ messages in thread
From: Stephen Smalley @ 2023-03-20 20:23 UTC (permalink / raw)
  To: Christian Göttsche
  Cc: Dominick Grift, Ondrej Mosnacek, Paul Moore, selinux

On Mon, Mar 20, 2023 at 2:22 PM Christian Göttsche
<cgzones@googlemail.com> wrote:
>
> On Mon, 20 Mar 2023 at 19:14, Dominick Grift <dominick.grift@defensec.nl> wrote:
> >
> > Stephen Smalley <stephen.smalley.work@gmail.com> writes:
> >
> > > On Mon, Mar 20, 2023 at 1:28 PM Stephen Smalley
> > > <stephen.smalley.work@gmail.com> wrote:
> > >> Hmm...that's interesting. I just tried in Fedora using one of the
> > >> type_transitions already defined in the default policy and although it
> > >> appears to use the type_transition to compute the new SID for the
> > >> create check, ls -Z of the file after creation showed it labeled
> > >> cgroup_t instead. So it doesn't appear to be working or I am doing it
> > >> wrong.
> >
> > I am totally confused now as well because Christian on IRC say's it
> > works for him but I cannot get it to work here and I tried various
> > combinations
> >
> > >
> > > Reproducer, on F34,
> > > $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
> > > mkdir: cannot create directory
> > > ‘/sys/fs/cgroup/system.slice/.snapshots’: Permission denied
> > > $ sudo ausearch -m AVC -ts recent -i
> > > ----
> > > type=AVC msg=audit(03/20/2023 13:00:04.699:47156) : avc:  denied  {
> > > associate } for  pid=152325 comm=mkdir name=.snapshots
> > > scontext=unconfined_u:object_r:snapperd_data_t:s0
> > > tcontext=system_u:object_r:cgroup_t:s0 tclass=filesystem permissive=0
> > > $ seinfo --fs_use | grep cgroup
> > > $ seinfo --genfscon | grep cgroup
> > >    genfscon cgroup /  system_u:object_r:cgroup_t:s0
> > >    genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
> > > $ sesearch -T -s unconfined_t -t cgroup_t -c dir
> > > type_transition unconfined_t cgroup_t:dir snapperd_data_t .snapshots
> > > $ sudo setenforce 0
> > > $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
> > > $ ls -Zd /sys/fs/cgroup/system.slice/.snapshots
> > > system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/.snapshots
> >
> > --
> > gpg --locate-keys dominick.grift@defensec.nl
> > Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
> > Dominick Grift
>
> Debian sid (Linux debianBullseye 6.1.0-6-amd64 #1 SMP PREEMPT_DYNAMIC
> Debian 6.1.15-1 (2023-03-05) x86_64 GNU/Linux):
>
> type cgroup_test_t;
> allow cgroup_test_t cgroup_t:filesystem associate;
> filetrans_pattern(sysadm_t, cgroup_t, cgroup_test_t, dir, "testdir")
> allow sysadm_t cgroup_test_t:dir { create_dir_perms list_dir_perms };
> allow sysadm_t cgroup_test_t:file getattr;
>
>
> $ seinfo --all | grep cgroup
> genfscon cgroup /  system_u:object_r:cgroup_t:s0
> genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
> genfscon proc /cgroups  system_u:object_r:proc_info_t:s0
> cgroup_seclabel
> cgroup_t
> cgroup_test_t
> systemd_cgroups_agent_exec_t
> systemd_cgroups_agent_runtime_t
> systemd_cgroups_agent_t
>
>
> $ grep cgroup /etc/selinux/debian/contexts/files/file_contexts
> /cgroup/.*              <<none>>
> /sys/fs/cgroup/.*               <<none>>
> /sys/fs/cgroup/[^/]+            -l      system_u:object_r:cgroup_t:s0
> /cgroup         -d      system_u:object_r:cgroup_t:s0
> /sys/fs/cgroup          -d      system_u:object_r:cgroup_t:s0
> /usr/lib/systemd/systemd-cgroups-agent          --
> system_u:object_r:systemd_cgroups_agent_exec_t:s0
>
>
> $ mkdir /sys/fs/cgroup/system.slice/testdir
> $ ls -laZ /sys/fs/cgroup/system.slice/testdir/
> total 0
> drwxr-x---.  2 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
> .
> drwxr-xr-x. 19 root root system_u:object_r:cgroup_t:s0  0 Mar 20 19:19
> ..
> -r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
> cgroup.controllers
> -r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
> cgroup.events

Hmm...I don't get the same result with 6.1.14-200.fc37.x86_64, using
the corresponding slightly tweaked policy module:
policy_module(cgrouptest, 1.0)
require {
	type cgroup_t;
	type unconfined_t;
}
type cgroup_test_t;
allow cgroup_test_t cgroup_t:filesystem associate;
filetrans_pattern(unconfined_t, cgroup_t, cgroup_test_t, dir, "testdir")
allow unconfined_t cgroup_test_t:dir { create_dir_perms list_dir_perms };
allow unconfined_t cgroup_test_t:file getattr;

That's on Fedora 37, not 34, sorry for the typo.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: cgroup2 labeling question
  2023-03-20 20:23                           ` Stephen Smalley
@ 2023-03-21 13:40                             ` Ondrej Mosnacek
  2023-03-21 14:42                               ` Dominick Grift
  0 siblings, 1 reply; 29+ messages in thread
From: Ondrej Mosnacek @ 2023-03-21 13:40 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Christian Göttsche, Dominick Grift, Paul Moore, selinux

On Mon, Mar 20, 2023 at 9:23 PM Stephen Smalley
<stephen.smalley.work@gmail.com> wrote:
>
> On Mon, Mar 20, 2023 at 2:22 PM Christian Göttsche
> <cgzones@googlemail.com> wrote:
> >
> > On Mon, 20 Mar 2023 at 19:14, Dominick Grift <dominick.grift@defensec.nl> wrote:
> > >
> > > Stephen Smalley <stephen.smalley.work@gmail.com> writes:
> > >
> > > > On Mon, Mar 20, 2023 at 1:28 PM Stephen Smalley
> > > > <stephen.smalley.work@gmail.com> wrote:
> > > >> Hmm...that's interesting. I just tried in Fedora using one of the
> > > >> type_transitions already defined in the default policy and although it
> > > >> appears to use the type_transition to compute the new SID for the
> > > >> create check, ls -Z of the file after creation showed it labeled
> > > >> cgroup_t instead. So it doesn't appear to be working or I am doing it
> > > >> wrong.
> > >
> > > I am totally confused now as well because Christian on IRC say's it
> > > works for him but I cannot get it to work here and I tried various
> > > combinations
> > >
> > > >
> > > > Reproducer, on F34,
> > > > $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
> > > > mkdir: cannot create directory
> > > > ‘/sys/fs/cgroup/system.slice/.snapshots’: Permission denied
> > > > $ sudo ausearch -m AVC -ts recent -i
> > > > ----
> > > > type=AVC msg=audit(03/20/2023 13:00:04.699:47156) : avc:  denied  {
> > > > associate } for  pid=152325 comm=mkdir name=.snapshots
> > > > scontext=unconfined_u:object_r:snapperd_data_t:s0
> > > > tcontext=system_u:object_r:cgroup_t:s0 tclass=filesystem permissive=0
> > > > $ seinfo --fs_use | grep cgroup
> > > > $ seinfo --genfscon | grep cgroup
> > > >    genfscon cgroup /  system_u:object_r:cgroup_t:s0
> > > >    genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
> > > > $ sesearch -T -s unconfined_t -t cgroup_t -c dir
> > > > type_transition unconfined_t cgroup_t:dir snapperd_data_t .snapshots
> > > > $ sudo setenforce 0
> > > > $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
> > > > $ ls -Zd /sys/fs/cgroup/system.slice/.snapshots
> > > > system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/.snapshots
> > >
> > > --
> > > gpg --locate-keys dominick.grift@defensec.nl
> > > Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
> > > Dominick Grift
> >
> > Debian sid (Linux debianBullseye 6.1.0-6-amd64 #1 SMP PREEMPT_DYNAMIC
> > Debian 6.1.15-1 (2023-03-05) x86_64 GNU/Linux):
> >
> > type cgroup_test_t;
> > allow cgroup_test_t cgroup_t:filesystem associate;
> > filetrans_pattern(sysadm_t, cgroup_t, cgroup_test_t, dir, "testdir")
> > allow sysadm_t cgroup_test_t:dir { create_dir_perms list_dir_perms };
> > allow sysadm_t cgroup_test_t:file getattr;
> >
> >
> > $ seinfo --all | grep cgroup
> > genfscon cgroup /  system_u:object_r:cgroup_t:s0
> > genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
> > genfscon proc /cgroups  system_u:object_r:proc_info_t:s0
> > cgroup_seclabel
> > cgroup_t
> > cgroup_test_t
> > systemd_cgroups_agent_exec_t
> > systemd_cgroups_agent_runtime_t
> > systemd_cgroups_agent_t
> >
> >
> > $ grep cgroup /etc/selinux/debian/contexts/files/file_contexts
> > /cgroup/.*              <<none>>
> > /sys/fs/cgroup/.*               <<none>>
> > /sys/fs/cgroup/[^/]+            -l      system_u:object_r:cgroup_t:s0
> > /cgroup         -d      system_u:object_r:cgroup_t:s0
> > /sys/fs/cgroup          -d      system_u:object_r:cgroup_t:s0
> > /usr/lib/systemd/systemd-cgroups-agent          --
> > system_u:object_r:systemd_cgroups_agent_exec_t:s0
> >
> >
> > $ mkdir /sys/fs/cgroup/system.slice/testdir
> > $ ls -laZ /sys/fs/cgroup/system.slice/testdir/
> > total 0
> > drwxr-x---.  2 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
> > .
> > drwxr-xr-x. 19 root root system_u:object_r:cgroup_t:s0  0 Mar 20 19:19
> > ..
> > -r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
> > cgroup.controllers
> > -r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
> > cgroup.events
>
> Hmm...I don't get the same result with 6.1.14-200.fc37.x86_64, using
> the corresponding slightly tweaked policy module:
> policy_module(cgrouptest, 1.0)
> require {
> type cgroup_t;
> type unconfined_t;
> }
> type cgroup_test_t;
> allow cgroup_test_t cgroup_t:filesystem associate;
> filetrans_pattern(unconfined_t, cgroup_t, cgroup_test_t, dir, "testdir")
> allow unconfined_t cgroup_test_t:dir { create_dir_perms list_dir_perms };
> allow unconfined_t cgroup_test_t:file getattr;
>
> That's on Fedora 37, not 34, sorry for the typo.

Ah, now I remember: we made it so that the transitions only apply if
the parent directory has a label explicitly set by userspace (via
setxattr). I'm not sure we can improve that easily, since we can't use
the normal inode-based logic for cgroupfs (the xattrs are stored in
kernfs nodes, each of which can be exposed via multiple inodes if there
is more than one cgroupfs mount).
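
To make that concrete, here is a minimal sketch of the workaround, not a
tested recipe. It assumes the cgrouptest module quoted above is loaded,
that setfattr (from the attr package) is installed, and that the caller
is allowed to relabel the directory. The helper name label_cgroup_dir
and the DRY_RUN switch are made up for illustration. Whether writing
back the type the directory already has is enough is my assumption.

```shell
#!/bin/sh
# Hypothetical helper: give a cgroup directory an explicit label by
# writing the security.selinux xattr, so that type_transition rules are
# honoured for objects created beneath it.
# With DRY_RUN=1 it only prints the command it would run.
label_cgroup_dir() {
    dir=$1
    ctx=$2
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf 'setfattr -n security.selinux -v %s %s\n' "$ctx" "$dir"
    else
        setfattr -n security.selinux -v "$ctx" "$dir"
    fi
}

# Example (as root, with the test module loaded):
#   label_cgroup_dir /sys/fs/cgroup/system.slice system_u:object_r:cgroup_t:s0
#   mkdir /sys/fs/cgroup/system.slice/testdir
#   ls -Zd /sys/fs/cgroup/system.slice/testdir
```

setfattr is used here only because it issues the setxattr call
directly; any tool that ends up calling setxattr on the parent should
do.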

-- 
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.


* Re: cgroup2 labeling question
  2023-03-21 13:40                             ` Ondrej Mosnacek
@ 2023-03-21 14:42                               ` Dominick Grift
  2023-03-22 17:07                                 ` Matthew Sheets
  0 siblings, 1 reply; 29+ messages in thread
From: Dominick Grift @ 2023-03-21 14:42 UTC (permalink / raw)
  To: Ondrej Mosnacek
  Cc: Stephen Smalley, Christian Göttsche, Paul Moore, selinux

Ondrej Mosnacek <omosnace@redhat.com> writes:

> On Mon, Mar 20, 2023 at 9:23 PM Stephen Smalley
> <stephen.smalley.work@gmail.com> wrote:
>>
>> On Mon, Mar 20, 2023 at 2:22 PM Christian Göttsche
>> <cgzones@googlemail.com> wrote:
>> >
>> > On Mon, 20 Mar 2023 at 19:14, Dominick Grift <dominick.grift@defensec.nl> wrote:
>> > >
>> > > Stephen Smalley <stephen.smalley.work@gmail.com> writes:
>> > >
>> > > > On Mon, Mar 20, 2023 at 1:28 PM Stephen Smalley
>> > > > <stephen.smalley.work@gmail.com> wrote:
>> > > >> Hmm...that's interesting. I just tried in Fedora using one of the
>> > > >> type_transitions already defined in the default policy and although it
>> > > >> appears to use the type_transition to compute the new SID for the
>> > > >> create check, ls -Z of the file after creation showed it labeled
>> > > >> cgroup_t instead. So it doesn't appear to be working or I am doing it
>> > > >> wrong.
>> > >
>> > > I am totally confused now as well because Christian on IRC say's it
>> > > works for him but I cannot get it to work here and I tried various
>> > > combinations
>> > >
>> > > >
>> > > > Reproducer, on F34,
>> > > > $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
>> > > > mkdir: cannot create directory
>> > > > ‘/sys/fs/cgroup/system.slice/.snapshots’: Permission denied
>> > > > $ sudo ausearch -m AVC -ts recent -i
>> > > > ----
>> > > > type=AVC msg=audit(03/20/2023 13:00:04.699:47156) : avc:  denied  {
>> > > > associate } for  pid=152325 comm=mkdir name=.snapshots
>> > > > scontext=unconfined_u:object_r:snapperd_data_t:s0
>> > > > tcontext=system_u:object_r:cgroup_t:s0 tclass=filesystem permissive=0
>> > > > $ seinfo --fs_use | grep cgroup
>> > > > $ seinfo --genfscon | grep cgroup
>> > > >    genfscon cgroup /  system_u:object_r:cgroup_t:s0
>> > > >    genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
>> > > > $ sesearch -T -s unconfined_t -t cgroup_t -c dir
>> > > > type_transition unconfined_t cgroup_t:dir snapperd_data_t .snapshots
>> > > > $ sudo setenforce 0
>> > > > $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
>> > > > $ ls -Zd /sys/fs/cgroup/system.slice/.snapshots
>> > > > system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/.snapshots
>> > >
>> > > --
>> > > gpg --locate-keys dominick.grift@defensec.nl
>> > > Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
>> > > Dominick Grift
>> >
>> > Debian sid (Linux debianBullseye 6.1.0-6-amd64 #1 SMP PREEMPT_DYNAMIC
>> > Debian 6.1.15-1 (2023-03-05) x86_64 GNU/Linux):
>> >
>> > type cgroup_test_t;
>> > allow cgroup_test_t cgroup_t:filesystem associate;
>> > filetrans_pattern(sysadm_t, cgroup_t, cgroup_test_t, dir, "testdir")
>> > allow sysadm_t cgroup_test_t:dir { create_dir_perms list_dir_perms };
>> > allow sysadm_t cgroup_test_t:file getattr;
>> >
>> >
>> > $ seinfo --all | grep cgroup
>> > genfscon cgroup /  system_u:object_r:cgroup_t:s0
>> > genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
>> > genfscon proc /cgroups  system_u:object_r:proc_info_t:s0
>> > cgroup_seclabel
>> > cgroup_t
>> > cgroup_test_t
>> > systemd_cgroups_agent_exec_t
>> > systemd_cgroups_agent_runtime_t
>> > systemd_cgroups_agent_t
>> >
>> >
>> > $ grep cgroup /etc/selinux/debian/contexts/files/file_contexts
>> > /cgroup/.*              <<none>>
>> > /sys/fs/cgroup/.*               <<none>>
>> > /sys/fs/cgroup/[^/]+            -l      system_u:object_r:cgroup_t:s0
>> > /cgroup         -d      system_u:object_r:cgroup_t:s0
>> > /sys/fs/cgroup          -d      system_u:object_r:cgroup_t:s0
>> > /usr/lib/systemd/systemd-cgroups-agent          --
>> > system_u:object_r:systemd_cgroups_agent_exec_t:s0
>> >
>> >
>> > $ mkdir /sys/fs/cgroup/system.slice/testdir
>> > $ ls -laZ /sys/fs/cgroup/system.slice/testdir/
>> > total 0
>> > drwxr-x---.  2 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
>> > .
>> > drwxr-xr-x. 19 root root system_u:object_r:cgroup_t:s0  0 Mar 20 19:19
>> > ..
>> > -r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
>> > cgroup.controllers
>> > -r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
>> > cgroup.events
>>
>> Hmm...I don't get the same result with 6.1.14-200.fc37.x86_64, using
>> the corresponding slightly tweaked policy module:
>> policy_module(cgrouptest, 1.0)
>> require {
>> type cgroup_t;
>> type unconfined_t;
>> }
>> type cgroup_test_t;
>> allow cgroup_test_t cgroup_t:filesystem associate;
>> filetrans_pattern(unconfined_t, cgroup_t, cgroup_test_t, dir, "testdir")
>> allow unconfined_t cgroup_test_t:dir { create_dir_perms list_dir_perms };
>> allow unconfined_t cgroup_test_t:file getattr;
>>
>> That's on Fedora 37, not 34, sorry for the typo.
>
> Ah, now I remembered that we made it such that the transitions would
> only apply if the parent directory has a label explicitly set by
> userspace (via setxattr). Not sure if we can improve it easily, since
> we can't use the normal inode-based logic for cgroupfs (the xattrs are
> stored in kernfs nodes, each of which can be exposed via multiple
> inodes if there is more than one cgroupfs mount).

Thanks. I can confirm that this indeed enabled transition functionality.

It does not solve my memory.pressure challenge, but I am implementing it
regardless in the hope that it addresses the races I encountered when
relying solely on genfscon for user.slice:

https://git.defensec.nl/?p=dssp5.git;a=commitdiff;h=1920c9f751445bfd51f43a7c4e9b7fedda057d15

We should probably document this "gotcha" in the selinux-notebook.

-- 
gpg --locate-keys dominick.grift@defensec.nl
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
Dominick Grift

* Re: cgroup2 labeling question
  2023-03-21 14:42                               ` Dominick Grift
@ 2023-03-22 17:07                                 ` Matthew Sheets
  2023-03-22 17:15                                   ` Dominick Grift
  2023-03-22 17:27                                   ` Stephen Smalley
  0 siblings, 2 replies; 29+ messages in thread
From: Matthew Sheets @ 2023-03-22 17:07 UTC (permalink / raw)
  To: Dominick Grift, Ondrej Mosnacek
  Cc: Stephen Smalley, Christian Göttsche, Paul Moore, selinux

On 3/21/2023 7:42 AM, Dominick Grift wrote:
> Ondrej Mosnacek <omosnace@redhat.com> writes:
> 
>> On Mon, Mar 20, 2023 at 9:23 PM Stephen Smalley
>> <stephen.smalley.work@gmail.com> wrote:
>>>
>>> On Mon, Mar 20, 2023 at 2:22 PM Christian Göttsche
>>> <cgzones@googlemail.com> wrote:
>>>>
>>>> On Mon, 20 Mar 2023 at 19:14, Dominick Grift <dominick.grift@defensec.nl> wrote:
>>>>>
>>>>> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
>>>>>
>>>>>> On Mon, Mar 20, 2023 at 1:28 PM Stephen Smalley
>>>>>> <stephen.smalley.work@gmail.com> wrote:
>>>>>>> Hmm...that's interesting. I just tried in Fedora using one of the
>>>>>>> type_transitions already defined in the default policy and although it
>>>>>>> appears to use the type_transition to compute the new SID for the
>>>>>>> create check, ls -Z of the file after creation showed it labeled
>>>>>>> cgroup_t instead. So it doesn't appear to be working or I am doing it
>>>>>>> wrong.
>>>>>
>>>>> I am totally confused now as well because Christian on IRC say's it
>>>>> works for him but I cannot get it to work here and I tried various
>>>>> combinations
>>>>>
>>>>>>
>>>>>> Reproducer, on F34,
>>>>>> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
>>>>>> mkdir: cannot create directory
>>>>>> ‘/sys/fs/cgroup/system.slice/.snapshots’: Permission denied
>>>>>> $ sudo ausearch -m AVC -ts recent -i
>>>>>> ----
>>>>>> type=AVC msg=audit(03/20/2023 13:00:04.699:47156) : avc:  denied  {
>>>>>> associate } for  pid=152325 comm=mkdir name=.snapshots
>>>>>> scontext=unconfined_u:object_r:snapperd_data_t:s0
>>>>>> tcontext=system_u:object_r:cgroup_t:s0 tclass=filesystem permissive=0
>>>>>> $ seinfo --fs_use | grep cgroup
>>>>>> $ seinfo --genfscon | grep cgroup
>>>>>>     genfscon cgroup /  system_u:object_r:cgroup_t:s0
>>>>>>     genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
>>>>>> $ sesearch -T -s unconfined_t -t cgroup_t -c dir
>>>>>> type_transition unconfined_t cgroup_t:dir snapperd_data_t .snapshots
>>>>>> $ sudo setenforce 0
>>>>>> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
>>>>>> $ ls -Zd /sys/fs/cgroup/system.slice/.snapshots
>>>>>> system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/.snapshots
>>>>>
>>>>> --
>>>>> gpg --locate-keys dominick.grift@defensec.nl
>>>>> Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
>>>>> Dominick Grift
>>>>
>>>> Debian sid (Linux debianBullseye 6.1.0-6-amd64 #1 SMP PREEMPT_DYNAMIC
>>>> Debian 6.1.15-1 (2023-03-05) x86_64 GNU/Linux):
>>>>
>>>> type cgroup_test_t;
>>>> allow cgroup_test_t cgroup_t:filesystem associate;
>>>> filetrans_pattern(sysadm_t, cgroup_t, cgroup_test_t, dir, "testdir")
>>>> allow sysadm_t cgroup_test_t:dir { create_dir_perms list_dir_perms };
>>>> allow sysadm_t cgroup_test_t:file getattr;
>>>>
>>>>
>>>> $ seinfo --all | grep cgroup
>>>> genfscon cgroup /  system_u:object_r:cgroup_t:s0
>>>> genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
>>>> genfscon proc /cgroups  system_u:object_r:proc_info_t:s0
>>>> cgroup_seclabel
>>>> cgroup_t
>>>> cgroup_test_t
>>>> systemd_cgroups_agent_exec_t
>>>> systemd_cgroups_agent_runtime_t
>>>> systemd_cgroups_agent_t
>>>>
>>>>
>>>> $ grep cgroup /etc/selinux/debian/contexts/files/file_contexts
>>>> /cgroup/.*              <<none>>
>>>> /sys/fs/cgroup/.*               <<none>>
>>>> /sys/fs/cgroup/[^/]+            -l      system_u:object_r:cgroup_t:s0
>>>> /cgroup         -d      system_u:object_r:cgroup_t:s0
>>>> /sys/fs/cgroup          -d      system_u:object_r:cgroup_t:s0
>>>> /usr/lib/systemd/systemd-cgroups-agent          --
>>>> system_u:object_r:systemd_cgroups_agent_exec_t:s0
>>>>
>>>>
>>>> $ mkdir /sys/fs/cgroup/system.slice/testdir
>>>> $ ls -laZ /sys/fs/cgroup/system.slice/testdir/
>>>> total 0
>>>> drwxr-x---.  2 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
>>>> .
>>>> drwxr-xr-x. 19 root root system_u:object_r:cgroup_t:s0  0 Mar 20 19:19
>>>> ..
>>>> -r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
>>>> cgroup.controllers
>>>> -r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
>>>> cgroup.events
>>>
>>> Hmm...I don't get the same result with 6.1.14-200.fc37.x86_64, using
>>> the corresponding slightly tweaked policy module:
>>> policy_module(cgrouptest, 1.0)
>>> require {
>>> type cgroup_t;
>>> type unconfined_t;
>>> }
>>> type cgroup_test_t;
>>> allow cgroup_test_t cgroup_t:filesystem associate;
>>> filetrans_pattern(unconfined_t, cgroup_t, cgroup_test_t, dir, "testdir")
>>> allow unconfined_t cgroup_test_t:dir { create_dir_perms list_dir_perms };
>>> allow unconfined_t cgroup_test_t:file getattr;
>>>
>>> That's on Fedora 37, not 34, sorry for the typo.
>>
>> Ah, now I remembered that we made it such that the transitions would
>> only apply if the parent directory has a label explicitly set by
>> userspace (via setxattr). Not sure if we can improve it easily, since
>> we can't use the normal inode-based logic for cgroupfs (the xattrs are
>> stored in kernfs nodes, each of which can be exposed via multiple
>> inodes if there is more than one cgroupfs mount).
> 
> Thanks. I can confirm that this indeed enabled transition functionality.
> 
> It does not solve my memory.pressure challenge but I implementing it
> regardless in hopes that it addresses the races I encountered when
> solely relying on genfscon for user.slice
> 
> https://git.defensec.nl/?p=dssp5.git;a=commitdiff;h=1920c9f751445bfd51f43a7c4e9b7fedda057d15
> 
> We should probably document this "gotcha" in the selinux-notebook
> 

Just to unify some other threads of conversation that have been going on
about this.

I helped the author of the initial PR that started this discussion.  We
knew we needed a new unique label, and I suggested that we try a named
file trans pattern from init_t just to see if it works, and it seemed to
work right out of the gate.  We didn't need to flip any other switches in
our test environment.

Here is an example of an AVC we are seeing:
AVC avc:  denied  { getattr } for  pid=5953 comm="systemd" 
path="/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.pressure" 
dev="cgroup2" ino=27721 scontext=unconfined_u:unconfined_r:unconfined_t 
tcontext=system_u:object_r:memory_pressure_t tclass=file permissive=0
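
For anyone reading along, the fields of that denial can be pulled apart
with a few lines of shell. This is only a sketch: avc_field is a
hypothetical helper, and it assumes no field value contains a space,
which holds for the record above.

```shell
#!/bin/sh
# The denial quoted above, joined onto one line.
avc='avc:  denied  { getattr } for  pid=5953 comm="systemd" path="/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.pressure" dev="cgroup2" ino=27721 scontext=unconfined_u:unconfined_r:unconfined_t tcontext=system_u:object_r:memory_pressure_t tclass=file permissive=0'

# Hypothetical helper: print the value of one key=value field.
# $1 is the AVC record, $2 is the field name.
avc_field() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n "s/^$2=//p"
}

avc_field "$avc" scontext   # who was denied
avc_field "$avc" tcontext   # label of the object
avc_field "$avc" tclass     # object class
```

The tcontext is the interesting part: the file already carries
memory_pressure_t, and the denial is about unconfined_t lacking getattr
on it.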

I do fear there is something different between our setup and those of
the other folks who have tested this, since our setup is fairly bespoke
compared to a standard Linux distro.  But off the top of my head I don't
know of any special setting we would have in place to make this work.

* Re: cgroup2 labeling question
  2023-03-22 17:07                                 ` Matthew Sheets
@ 2023-03-22 17:15                                   ` Dominick Grift
  2023-03-22 17:27                                   ` Stephen Smalley
  1 sibling, 0 replies; 29+ messages in thread
From: Dominick Grift @ 2023-03-22 17:15 UTC (permalink / raw)
  To: Matthew Sheets
  Cc: Ondrej Mosnacek, Stephen Smalley, Christian Göttsche,
	Paul Moore, selinux

Matthew Sheets <masheets@linux.microsoft.com> writes:

> On 3/21/2023 7:42 AM, Dominick Grift wrote:
>> Ondrej Mosnacek <omosnace@redhat.com> writes:
>> 
>>> On Mon, Mar 20, 2023 at 9:23 PM Stephen Smalley
>>> <stephen.smalley.work@gmail.com> wrote:
>>>>
>>>> On Mon, Mar 20, 2023 at 2:22 PM Christian Göttsche
>>>> <cgzones@googlemail.com> wrote:
>>>>>
>>>>> On Mon, 20 Mar 2023 at 19:14, Dominick Grift <dominick.grift@defensec.nl> wrote:
>>>>>>
>>>>>> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
>>>>>>
>>>>>>> On Mon, Mar 20, 2023 at 1:28 PM Stephen Smalley
>>>>>>> <stephen.smalley.work@gmail.com> wrote:
>>>>>>>> Hmm...that's interesting. I just tried in Fedora using one of the
>>>>>>>> type_transitions already defined in the default policy and although it
>>>>>>>> appears to use the type_transition to compute the new SID for the
>>>>>>>> create check, ls -Z of the file after creation showed it labeled
>>>>>>>> cgroup_t instead. So it doesn't appear to be working or I am doing it
>>>>>>>> wrong.
>>>>>>
>>>>>> I am totally confused now as well because Christian on IRC say's it
>>>>>> works for him but I cannot get it to work here and I tried various
>>>>>> combinations
>>>>>>
>>>>>>>
>>>>>>> Reproducer, on F34,
>>>>>>> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
>>>>>>> mkdir: cannot create directory
>>>>>>> ‘/sys/fs/cgroup/system.slice/.snapshots’: Permission denied
>>>>>>> $ sudo ausearch -m AVC -ts recent -i
>>>>>>> ----
>>>>>>> type=AVC msg=audit(03/20/2023 13:00:04.699:47156) : avc:  denied  {
>>>>>>> associate } for  pid=152325 comm=mkdir name=.snapshots
>>>>>>> scontext=unconfined_u:object_r:snapperd_data_t:s0
>>>>>>> tcontext=system_u:object_r:cgroup_t:s0 tclass=filesystem permissive=0
>>>>>>> $ seinfo --fs_use | grep cgroup
>>>>>>> $ seinfo --genfscon | grep cgroup
>>>>>>>     genfscon cgroup /  system_u:object_r:cgroup_t:s0
>>>>>>>     genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
>>>>>>> $ sesearch -T -s unconfined_t -t cgroup_t -c dir
>>>>>>> type_transition unconfined_t cgroup_t:dir snapperd_data_t .snapshots
>>>>>>> $ sudo setenforce 0
>>>>>>> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
>>>>>>> $ ls -Zd /sys/fs/cgroup/system.slice/.snapshots
>>>>>>> system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/.snapshots
>>>>>>
>>>>>> --
>>>>>> gpg --locate-keys dominick.grift@defensec.nl
>>>>>> Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
>>>>>> Dominick Grift
>>>>>
>>>>> Debian sid (Linux debianBullseye 6.1.0-6-amd64 #1 SMP PREEMPT_DYNAMIC
>>>>> Debian 6.1.15-1 (2023-03-05) x86_64 GNU/Linux):
>>>>>
>>>>> type cgroup_test_t;
>>>>> allow cgroup_test_t cgroup_t:filesystem associate;
>>>>> filetrans_pattern(sysadm_t, cgroup_t, cgroup_test_t, dir, "testdir")
>>>>> allow sysadm_t cgroup_test_t:dir { create_dir_perms list_dir_perms };
>>>>> allow sysadm_t cgroup_test_t:file getattr;
>>>>>
>>>>>
>>>>> $ seinfo --all | grep cgroup
>>>>> genfscon cgroup /  system_u:object_r:cgroup_t:s0
>>>>> genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
>>>>> genfscon proc /cgroups  system_u:object_r:proc_info_t:s0
>>>>> cgroup_seclabel
>>>>> cgroup_t
>>>>> cgroup_test_t
>>>>> systemd_cgroups_agent_exec_t
>>>>> systemd_cgroups_agent_runtime_t
>>>>> systemd_cgroups_agent_t
>>>>>
>>>>>
>>>>> $ grep cgroup /etc/selinux/debian/contexts/files/file_contexts
>>>>> /cgroup/.*              <<none>>
>>>>> /sys/fs/cgroup/.*               <<none>>
>>>>> /sys/fs/cgroup/[^/]+            -l      system_u:object_r:cgroup_t:s0
>>>>> /cgroup         -d      system_u:object_r:cgroup_t:s0
>>>>> /sys/fs/cgroup          -d      system_u:object_r:cgroup_t:s0
>>>>> /usr/lib/systemd/systemd-cgroups-agent          --
>>>>> system_u:object_r:systemd_cgroups_agent_exec_t:s0
>>>>>
>>>>>
>>>>> $ mkdir /sys/fs/cgroup/system.slice/testdir
>>>>> $ ls -laZ /sys/fs/cgroup/system.slice/testdir/
>>>>> total 0
>>>>> drwxr-x---.  2 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
>>>>> .
>>>>> drwxr-xr-x. 19 root root system_u:object_r:cgroup_t:s0  0 Mar 20 19:19
>>>>> ..
>>>>> -r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
>>>>> cgroup.controllers
>>>>> -r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
>>>>> cgroup.events
>>>>
>>>> Hmm...I don't get the same result with 6.1.14-200.fc37.x86_64, using
>>>> the corresponding slightly tweaked policy module:
>>>> policy_module(cgrouptest, 1.0)
>>>> require {
>>>> type cgroup_t;
>>>> type unconfined_t;
>>>> }
>>>> type cgroup_test_t;
>>>> allow cgroup_test_t cgroup_t:filesystem associate;
>>>> filetrans_pattern(unconfined_t, cgroup_t, cgroup_test_t, dir, "testdir")
>>>> allow unconfined_t cgroup_test_t:dir { create_dir_perms list_dir_perms };
>>>> allow unconfined_t cgroup_test_t:file getattr;
>>>>
>>>> That's on Fedora 37, not 34, sorry for the typo.
>>>
>>> Ah, now I remembered that we made it such that the transitions would
>>> only apply if the parent directory has a label explicitly set by
>>> userspace (via setxattr). Not sure if we can improve it easily, since
>>> we can't use the normal inode-based logic for cgroupfs (the xattrs are
>>> stored in kernfs nodes, each of which can be exposed via multiple
>>> inodes if there is more than one cgroupfs mount).
>> Thanks. I can confirm that this indeed enabled transition
>> functionality.
>> It does not solve my memory.pressure challenge but I implementing it
>> regardless in hopes that it addresses the races I encountered when
>> solely relying on genfscon for user.slice
>> https://git.defensec.nl/?p=dssp5.git;a=commitdiff;h=1920c9f751445bfd51f43a7c4e9b7fedda057d15
>> We should probably document this "gotcha" in the selinux-notebook
>> 
>
> Just to unify some other threads of conversation that has been going
> on for this.
>
> I helped the author of the initial PR that started this discussion.
> We knew we needed a new unique label and I suggested that we try a
> named file trans pattern from init_t just to see if it works, and it
> seemed to right out of the gates.  We didn't need to flip any other
> switches on our test environment.
>
> Here is an example of an AVC we are seeing:
> AVC avc:  denied  { getattr } for  pid=5953 comm="systemd"
> path="/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.pressure"
> dev="cgroup2" ino=27721
> scontext=unconfined_u:unconfined_r:unconfined_t
> tcontext=system_u:object_r:memory_pressure_t tclass=file permissive=0
>
> I do fear there is something different from the other folks that have
> tested this and our setup, since out setup is fairly bespoke compared
> to your standard Linux distro.  But off the top of my head I don't
> know any special setting we would have in place to make this work.
>

I just retested it here and I can't get it to work (but I might be
overlooking something).

I recorded the test for anyone interested:

https://www.defensec.nl/~kcinimod/stuff/cgroup8.mp4

-- 
gpg --locate-keys dominick.grift@defensec.nl
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
Dominick Grift

* Re: cgroup2 labeling question
  2023-03-22 17:07                                 ` Matthew Sheets
  2023-03-22 17:15                                   ` Dominick Grift
@ 2023-03-22 17:27                                   ` Stephen Smalley
  2023-03-23 13:55                                     ` Matthew Sheets
  1 sibling, 1 reply; 29+ messages in thread
From: Stephen Smalley @ 2023-03-22 17:27 UTC (permalink / raw)
  To: Matthew Sheets
  Cc: Dominick Grift, Ondrej Mosnacek, Christian Göttsche,
	Paul Moore, selinux

On Wed, Mar 22, 2023 at 1:07 PM Matthew Sheets
<masheets@linux.microsoft.com> wrote:
>
> On 3/21/2023 7:42 AM, Dominick Grift wrote:
> > Ondrej Mosnacek <omosnace@redhat.com> writes:
> >
> >> On Mon, Mar 20, 2023 at 9:23 PM Stephen Smalley
> >> <stephen.smalley.work@gmail.com> wrote:
> >>>
> >>> On Mon, Mar 20, 2023 at 2:22 PM Christian Göttsche
> >>> <cgzones@googlemail.com> wrote:
> >>>>
> >>>> On Mon, 20 Mar 2023 at 19:14, Dominick Grift <dominick.grift@defensec.nl> wrote:
> >>>>>
> >>>>> Stephen Smalley <stephen.smalley.work@gmail.com> writes:
> >>>>>
> >>>>>> On Mon, Mar 20, 2023 at 1:28 PM Stephen Smalley
> >>>>>> <stephen.smalley.work@gmail.com> wrote:
> >>>>>>> Hmm...that's interesting. I just tried in Fedora using one of the
> >>>>>>> type_transitions already defined in the default policy and although it
> >>>>>>> appears to use the type_transition to compute the new SID for the
> >>>>>>> create check, ls -Z of the file after creation showed it labeled
> >>>>>>> cgroup_t instead. So it doesn't appear to be working or I am doing it
> >>>>>>> wrong.
> >>>>>
> >>>>> I am totally confused now as well because Christian on IRC say's it
> >>>>> works for him but I cannot get it to work here and I tried various
> >>>>> combinations
> >>>>>
> >>>>>>
> >>>>>> Reproducer, on F34,
> >>>>>> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
> >>>>>> mkdir: cannot create directory
> >>>>>> ‘/sys/fs/cgroup/system.slice/.snapshots’: Permission denied
> >>>>>> $ sudo ausearch -m AVC -ts recent -i
> >>>>>> ----
> >>>>>> type=AVC msg=audit(03/20/2023 13:00:04.699:47156) : avc:  denied  {
> >>>>>> associate } for  pid=152325 comm=mkdir name=.snapshots
> >>>>>> scontext=unconfined_u:object_r:snapperd_data_t:s0
> >>>>>> tcontext=system_u:object_r:cgroup_t:s0 tclass=filesystem permissive=0
> >>>>>> $ seinfo --fs_use | grep cgroup
> >>>>>> $ seinfo --genfscon | grep cgroup
> >>>>>>     genfscon cgroup /  system_u:object_r:cgroup_t:s0
> >>>>>>     genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
> >>>>>> $ sesearch -T -s unconfined_t -t cgroup_t -c dir
> >>>>>> type_transition unconfined_t cgroup_t:dir snapperd_data_t .snapshots
> >>>>>> $ sudo setenforce 0
> >>>>>> $ sudo mkdir /sys/fs/cgroup/system.slice/.snapshots
> >>>>>> $ ls -Zd /sys/fs/cgroup/system.slice/.snapshots
> >>>>>> system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/.snapshots
> >>>>>
> >>>>> --
> >>>>> gpg --locate-keys dominick.grift@defensec.nl
> >>>>> Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
> >>>>> Dominick Grift
> >>>>
> >>>> Debian sid (Linux debianBullseye 6.1.0-6-amd64 #1 SMP PREEMPT_DYNAMIC
> >>>> Debian 6.1.15-1 (2023-03-05) x86_64 GNU/Linux):
> >>>>
> >>>> type cgroup_test_t;
> >>>> allow cgroup_test_t cgroup_t:filesystem associate;
> >>>> filetrans_pattern(sysadm_t, cgroup_t, cgroup_test_t, dir, "testdir")
> >>>> allow sysadm_t cgroup_test_t:dir { create_dir_perms list_dir_perms };
> >>>> allow sysadm_t cgroup_test_t:file getattr;
> >>>>
> >>>>
> >>>> $ seinfo --all | grep cgroup
> >>>> genfscon cgroup /  system_u:object_r:cgroup_t:s0
> >>>> genfscon cgroup2 /  system_u:object_r:cgroup_t:s0
> >>>> genfscon proc /cgroups  system_u:object_r:proc_info_t:s0
> >>>> cgroup_seclabel
> >>>> cgroup_t
> >>>> cgroup_test_t
> >>>> systemd_cgroups_agent_exec_t
> >>>> systemd_cgroups_agent_runtime_t
> >>>> systemd_cgroups_agent_t
> >>>>
> >>>>
> >>>> $ grep cgroup /etc/selinux/debian/contexts/files/file_contexts
> >>>> /cgroup/.*              <<none>>
> >>>> /sys/fs/cgroup/.*               <<none>>
> >>>> /sys/fs/cgroup/[^/]+            -l      system_u:object_r:cgroup_t:s0
> >>>> /cgroup         -d      system_u:object_r:cgroup_t:s0
> >>>> /sys/fs/cgroup          -d      system_u:object_r:cgroup_t:s0
> >>>> /usr/lib/systemd/systemd-cgroups-agent          --
> >>>> system_u:object_r:systemd_cgroups_agent_exec_t:s0
> >>>>
> >>>>
> >>>> $ mkdir /sys/fs/cgroup/system.slice/testdir
> >>>> $ ls -laZ /sys/fs/cgroup/system.slice/testdir/
> >>>> total 0
> >>>> drwxr-x---.  2 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
> >>>> .
> >>>> drwxr-xr-x. 19 root root system_u:object_r:cgroup_t:s0  0 Mar 20 19:19
> >>>> ..
> >>>> -r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
> >>>> cgroup.controllers
> >>>> -r--r--r--.  1 root root root:object_r:cgroup_test_t:s0 0 Mar 20 19:19
> >>>> cgroup.events
> >>>
> >>> Hmm...I don't get the same result with 6.1.14-200.fc37.x86_64, using
> >>> the corresponding slightly tweaked policy module:
> >>> policy_module(cgrouptest, 1.0)
> >>> require {
> >>> type cgroup_t;
> >>> type unconfined_t;
> >>> }
> >>> type cgroup_test_t;
> >>> allow cgroup_test_t cgroup_t:filesystem associate;
> >>> filetrans_pattern(unconfined_t, cgroup_t, cgroup_test_t, dir, "testdir")
> >>> allow unconfined_t cgroup_test_t:dir { create_dir_perms list_dir_perms };
> >>> allow unconfined_t cgroup_test_t:file getattr;
> >>>
> >>> That's on Fedora 37, not 34, sorry for the typo.
> >>
> >> Ah, now I remembered that we made it such that the transitions would
> >> only apply if the parent directory has a label explicitly set by
> >> userspace (via setxattr). Not sure if we can improve it easily, since
> >> we can't use the normal inode-based logic for cgroupfs (the xattrs are
> >> stored in kernfs nodes, each of which can be exposed via multiple
> >> inodes if there is more than one cgroupfs mount).
> >
> > Thanks. I can confirm that this indeed enabled transition functionality.
> >
> > It does not solve my memory.pressure challenge but I implementing it
> > regardless in hopes that it addresses the races I encountered when
> > solely relying on genfscon for user.slice
> >
> > https://git.defensec.nl/?p=dssp5.git;a=commitdiff;h=1920c9f751445bfd51f43a7c4e9b7fedda057d15
> >
> > We should probably document this "gotcha" in the selinux-notebook
> >
>
> Just to unify some other threads of conversation that has been going on
> for this.
>
> I helped the author of the initial PR that started this discussion.  We
> knew we needed a new unique label and I suggested that we try a named
> file trans pattern from init_t just to see if it works, and it seemed to
> right out of the gates.  We didn't need to flip any other switches on
> our test environment.
>
> Here is an example of an AVC we are seeing:
> AVC avc:  denied  { getattr } for  pid=5953 comm="systemd"
> path="/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.pressure"
> dev="cgroup2" ino=27721 scontext=unconfined_u:unconfined_r:unconfined_t
> tcontext=system_u:object_r:memory_pressure_t tclass=file permissive=0
>
> I do fear there is something different from the other folks that have
> tested this and our setup, since out setup is fairly bespoke compared to
> your standard Linux distro.  But off the top of my head I don't know any
> special setting we would have in place to make this work.

Questions:
- Did systemd or some other userspace process first set the context of
/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service
explicitly?
- Could you post the exact type_transition rule(s) from your policy,
e.g. sesearch -T -s unconfined_t -D memory_pressure_t?
- Does ls -Z of the file also report that context?
- Kernel version?
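
The checks asked for above could be gathered in one pass with something like the sketch below. The function name report_cgroup_labels is hypothetical, and the sesearch query is only one possible way to list file transitions on cgroup_t. Note that ls -Z shows the effective label only; it does not reveal whether that label was set explicitly by userspace.

```sh
#!/bin/sh
# Hypothetical helper: print kernel version, labels, and transitions for one cgroup dir.
report_cgroup_labels() {
    dir=$1
    echo "kernel: $(uname -r)"
    echo "directory label:"
    ls -Zd "$dir"
    if [ -e "$dir/memory.pressure" ]; then
        echo "memory.pressure label:"
        ls -Z "$dir/memory.pressure"
    fi
    # sesearch ships with setools; skip the query if it is not installed.
    if command -v sesearch >/dev/null 2>&1; then
        echo "type transitions for files created under cgroup_t:"
        sesearch -T -t cgroup_t -c file
    fi
}
```

For example: report_cgroup_labels /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service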

* Re: cgroup2 labeling question
  2023-03-22 17:27                                   ` Stephen Smalley
@ 2023-03-23 13:55                                     ` Matthew Sheets
  2023-03-23 14:42                                       ` Matthew Sheets
  2023-03-23 16:56                                       ` Stephen Smalley
  0 siblings, 2 replies; 29+ messages in thread
From: Matthew Sheets @ 2023-03-23 13:55 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Dominick Grift, Ondrej Mosnacek, Christian Göttsche,
	Paul Moore, selinux



On 3/22/2023 10:27 AM, Stephen Smalley wrote:
> On Wed, Mar 22, 2023 at 1:07 PM Matthew Sheets
> <masheets@linux.microsoft.com> wrote:
>>
>> Just to unify some other threads of conversation that has been going on
>> for this.
>>
>> I helped the author of the initial PR that started this discussion.  We
>> knew we needed a new unique label and I suggested that we try a named
>> file trans pattern from init_t just to see if it works, and it seemed to
>> right out of the gates.  We didn't need to flip any other switches on
>> our test environment.
>>
>> Here is an example of an AVC we are seeing:
>> AVC avc:  denied  { getattr } for  pid=5953 comm="systemd"
>> path="/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.pressure"
>> dev="cgroup2" ino=27721 scontext=unconfined_u:unconfined_r:unconfined_t
>> tcontext=system_u:object_r:memory_pressure_t tclass=file permissive=0
>>
>> I do fear there is something different from the other folks that have
>> tested this and our setup, since out setup is fairly bespoke compared to
>> your standard Linux distro.  But off the top of my head I don't know any
>> special setting we would have in place to make this work.
> 
> Questions:
> - Did systemd or some other userspace process first set the context of
> /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service
> explicitly?
> - Could you post the exact type_transition rule(s) from your policy,
> e.g. sesearch -T -s unconfined_t -D memory_pressure_t?
> - Does ls -Z of the file also report that context?
> - Kernel version?

1. We believe it is systemd.  At the very least it's nothing we are
    directly doing.
2. type_transition init_t cgroup_t:file memory_pressure_t memory.pressure;
    In the above example unconfined_t was just trying to access it but
    we have the trans coming from init_t
3. Yes ls -Z shows the proper context as well.
4. For this specific test it was 5.10.154 but we have 5.10.x in some
    of our other testing environments.
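
Point 2 is easy to misread: the denial's source type (unconfined_t) is not the source of the transition rule (init_t). A minimal sketch for pulling the relevant types out of an AVC record follows; avc_field and context_type are hypothetical helper names, and the record is the one quoted above.

```sh
#!/bin/sh
# Print the value of one key=value field (e.g. scontext) from an AVC record.
avc_field() {
    # $1 = field name, $2 = AVC record text
    printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1=//p" | head -n 1
}

# Print the type, i.e. the third colon-separated component of a context.
context_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

avc='avc:  denied  { getattr } for  pid=5953 comm="systemd" path="/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.pressure" dev="cgroup2" ino=27721 scontext=unconfined_u:unconfined_r:unconfined_t tcontext=system_u:object_r:memory_pressure_t tclass=file permissive=0'

context_type "$(avc_field scontext "$avc")"   # prints unconfined_t (who was denied)
context_type "$(avc_field tcontext "$avc")"   # prints memory_pressure_t (the file's type)
```

So the file already carries memory_pressure_t; what is missing is an allow rule for unconfined_t, which is a separate matter from the init_t transition.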

* Re: cgroup2 labeling question
  2023-03-23 13:55                                     ` Matthew Sheets
@ 2023-03-23 14:42                                       ` Matthew Sheets
  2023-03-23 14:53                                         ` Dominick Grift
  2023-03-23 16:56                                       ` Stephen Smalley
  1 sibling, 1 reply; 29+ messages in thread
From: Matthew Sheets @ 2023-03-23 14:42 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Dominick Grift, Ondrej Mosnacek, Christian Göttsche,
	Paul Moore, selinux



On 3/23/2023 6:55 AM, Matthew Sheets wrote:
> 
> 
> On 3/22/2023 10:27 AM, Stephen Smalley wrote:
>> On Wed, Mar 22, 2023 at 1:07 PM Matthew Sheets
>> <masheets@linux.microsoft.com> wrote:
>>>
>>> Just to unify some other threads of conversation that has been going on
>>> for this.
>>>
>>> I helped the author of the initial PR that started this discussion.  We
>>> knew we needed a new unique label and I suggested that we try a named
>>> file trans pattern from init_t just to see if it works, and it seemed to
>>> right out of the gates.  We didn't need to flip any other switches on
>>> our test environment.
>>>
>>> Here is an example of an AVC we are seeing:
>>> AVC avc:  denied  { getattr } for  pid=5953 comm="systemd"
>>> path="/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.pressure"
>>> dev="cgroup2" ino=27721 scontext=unconfined_u:unconfined_r:unconfined_t
>>> tcontext=system_u:object_r:memory_pressure_t tclass=file permissive=0
>>>
>>> I do fear there is something different from the other folks that have
>>> tested this and our setup, since out setup is fairly bespoke compared to
>>> your standard Linux distro.  But off the top of my head I don't know any
>>> special setting we would have in place to make this work.
>>
>> Questions:
>> - Did systemd or some other userspace process first set the context of
>> /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service
>> explicitly?
>> - Could you post the exact type_transition rule(s) from your policy,
>> e.g. sesearch -T -s unconfined_t -D memory_pressure_t?
>> - Does ls -Z of the file also report that context?
>> - Kernel version?
> 
> 1. We believe it is systemd.  At the very least its nothing we are
>     directly doing.
> 2. type_transition init_t cgroup_t:file memory_pressure_t memory.pressure;
>     In the above example unconfined_t was just trying to access it but
>     we have the trans coming from init_t
> 3. Yes ls -Z shows the proper context as well.
> 4. For this specific test it was 5.10.154 but we have 5.10.x in some
>     of our other testing environments.

Clarification on 1: I meant to say that we aren't doing anything special
ourselves.  Nothing is being hand-labeled unless systemd is doing
something unknown under the hood.

* Re: cgroup2 labeling question
  2023-03-23 14:42                                       ` Matthew Sheets
@ 2023-03-23 14:53                                         ` Dominick Grift
  0 siblings, 0 replies; 29+ messages in thread
From: Dominick Grift @ 2023-03-23 14:53 UTC (permalink / raw)
  To: Matthew Sheets
  Cc: Stephen Smalley, Ondrej Mosnacek, Christian Göttsche,
	Paul Moore, selinux

Matthew Sheets <masheets@linux.microsoft.com> writes:

> On 3/23/2023 6:55 AM, Matthew Sheets wrote:
>> On 3/22/2023 10:27 AM, Stephen Smalley wrote:
>>> On Wed, Mar 22, 2023 at 1:07 PM Matthew Sheets
>>> <masheets@linux.microsoft.com> wrote:
>>>>
>>>> Just to unify some other threads of conversation that has been going on
>>>> for this.
>>>>
>>>> I helped the author of the initial PR that started this discussion.  We
>>>> knew we needed a new unique label and I suggested that we try a named
>>>> file trans pattern from init_t just to see if it works, and it seemed to
>>>> right out of the gates.  We didn't need to flip any other switches on
>>>> our test environment.
>>>>
>>>> Here is an example of an AVC we are seeing:
>>>> AVC avc:  denied  { getattr } for  pid=5953 comm="systemd"
>>>> path="/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.pressure"
>>>> dev="cgroup2" ino=27721 scontext=unconfined_u:unconfined_r:unconfined_t
>>>> tcontext=system_u:object_r:memory_pressure_t tclass=file permissive=0
>>>>
>>>> I do fear there is something different from the other folks that have
>>>> tested this and our setup, since out setup is fairly bespoke compared to
>>>> your standard Linux distro.  But off the top of my head I don't know any
>>>> special setting we would have in place to make this work.
>>>
>>> Questions:
>>> - Did systemd or some other userspace process first set the context of
>>> /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service
>>> explicitly?
>>> - Could you post the exact type_transition rule(s) from your policy,
>>> e.g. sesearch -T -s unconfined_t -D memory_pressure_t?
>>> - Does ls -Z of the file also report that context?
>>> - Kernel version?
>> 1. We believe it is systemd.  At the very least its nothing we are
>>     directly doing.
>> 2. type_transition init_t cgroup_t:file memory_pressure_t memory.pressure;
>>     In the above example unconfined_t was just trying to access it but
>>     we have the trans coming from init_t
>> 3. Yes ls -Z shows the proper context as well.
>> 4. For this specific test it was 5.10.154 but we have 5.10.x in some
>>     of our other testing environments.
>
> Clarification on 1. I meant to say that we aren't doing anything special
> ourselves.  Nothing is being hand labeled unless systemd is doing
> something unknown under the hood.
>

I was considering that as well, but it seems unlikely because the
memory.pressure labels aren't backed by filecons. That means that
systemd is not using setfscreatecon (which would be unlikely to apply
anyway), and it does not reset the context manually because it has no
way to determine what the label should be.

Really strange. I guess I was just overlooking something.
Would be nice to figure out what it is I am missing here.

-- 
gpg --locate-keys dominick.grift@defensec.nl
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
Dominick Grift

* Re: cgroup2 labeling question
  2023-03-23 13:55                                     ` Matthew Sheets
  2023-03-23 14:42                                       ` Matthew Sheets
@ 2023-03-23 16:56                                       ` Stephen Smalley
  1 sibling, 0 replies; 29+ messages in thread
From: Stephen Smalley @ 2023-03-23 16:56 UTC (permalink / raw)
  To: Matthew Sheets
  Cc: Dominick Grift, Ondrej Mosnacek, Christian Göttsche,
	Paul Moore, selinux

On Thu, Mar 23, 2023 at 9:55 AM Matthew Sheets
<masheets@linux.microsoft.com> wrote:
>
>
>
> On 3/22/2023 10:27 AM, Stephen Smalley wrote:
> > On Wed, Mar 22, 2023 at 1:07 PM Matthew Sheets
> > <masheets@linux.microsoft.com> wrote:
> >> I helped the author of the initial PR that started this discussion.  We
> >> knew we needed a new unique label and I suggested that we try a named
> >> file trans pattern from init_t just to see if it works, and it seemed to
> >> right out of the gates.  We didn't need to flip any other switches on
> >> our test environment.
> >>
> >> Here is an example of an AVC we are seeing:
> >> AVC avc:  denied  { getattr } for  pid=5953 comm="systemd"
> >> path="/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.pressure"
> >> dev="cgroup2" ino=27721 scontext=unconfined_u:unconfined_r:unconfined_t
> >> tcontext=system_u:object_r:memory_pressure_t tclass=file permissive=0
> >>
> >> I do fear there is something different from the other folks that have
> >> tested this and our setup, since out setup is fairly bespoke compared to
> >> your standard Linux distro.  But off the top of my head I don't know any
> >> special setting we would have in place to make this work.
> >
> > Questions:
> > - Did systemd or some other userspace process first set the context of
> > /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service
> > explicitly?
> > - Could you post the exact type_transition rule(s) from your policy,
> > e.g. sesearch -T -s unconfined_t -D memory_pressure_t?
> > - Does ls -Z of the file also report that context?
> > - Kernel version?
>
> 1. We believe it is systemd.  At the very least its nothing we are
>     directly doing.
> 2. type_transition init_t cgroup_t:file memory_pressure_t memory.pressure;
>     In the above example unconfined_t was just trying to access it but
>     we have the trans coming from init_t
> 3. Yes ls -Z shows the proper context as well.
> 4. For this specific test it was 5.10.154 but we have 5.10.x in some
>     of our other testing environments.

So if I add that type_transition to Fedora policy and reboot, some of
the memory.pressure files are labeled memory_pressure_t while others
are labeled cgroup_t, as shown below. I'm guessing this has to do with
which process was current when the file was created (or with some files
being created before policy load), but I'm not sure.
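
To see how a listing like the one below splits by creator, the contexts can be tallied by SELinux user and type. This is only a sketch: tally_contexts is a hypothetical name, and it assumes one "context path" pair per input line, as ls -Z prints it.

```sh
#!/bin/sh
# Count "ls -Z" lines per (SELinux user, type) pair; the context is the first field.
tally_contexts() {
    awk '{ split($1, c, ":"); n[c[1] " " c[3]]++ }
         END { for (k in n) print n[k], k }' | LC_ALL=C sort -k2
}
```

Usage: sudo find /sys/fs/cgroup -name memory.pressure -exec ls -Z {} \; | tally_contexts

In the listing, the files labeled unconfined_u:object_r:cgroup_t:s0 all sit under a user@N.service subtree, so grouping by user as well as type helps separate them from the system_u cgroup_t entries.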

$ sudo find /sys/fs/cgroup -name memory.pressure -exec ls -Z {} \;
system_u:object_r:memory_pressure_t:s0 /sys/fs/cgroup/sys-fs-fuse-connections.mount/memory.pressure
system_u:object_r:memory_pressure_t:s0 /sys/fs/cgroup/sys-kernel-config.mount/memory.pressure
system_u:object_r:memory_pressure_t:s0 /sys/fs/cgroup/sys-kernel-debug.mount/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/memory.pressure
system_u:object_r:memory_pressure_t:s0 /sys/fs/cgroup/dev-mqueue.mount/memory.pressure
system_u:object_r:memory_pressure_t:s0 /sys/fs/cgroup/user.slice/user-982.slice/memory.pressure
system_u:object_r:memory_pressure_t:s0 /sys/fs/cgroup/user.slice/user-982.slice/session-c1.scope/memory.pressure
system_u:object_r:memory_pressure_t:s0 /sys/fs/cgroup/user.slice/user-982.slice/user@982.service/memory.pressure
unconfined_u:object_r:cgroup_t:s0 /sys/fs/cgroup/user.slice/user-982.slice/user@982.service/app.slice/memory.pressure
unconfined_u:object_r:cgroup_t:s0 /sys/fs/cgroup/user.slice/user-982.slice/user@982.service/app.slice/dbus.socket/memory.pressure
unconfined_u:object_r:cgroup_t:s0 /sys/fs/cgroup/user.slice/user-982.slice/user@982.service/init.scope/memory.pressure
system_u:object_r:memory_pressure_t:s0 /sys/fs/cgroup/user.slice/memory.pressure
system_u:object_r:memory_pressure_t:s0 /sys/fs/cgroup/user.slice/user-0.slice/session-2.scope/memory.pressure
system_u:object_r:memory_pressure_t:s0 /sys/fs/cgroup/user.slice/user-0.slice/memory.pressure
system_u:object_r:memory_pressure_t:s0 /sys/fs/cgroup/user.slice/user-0.slice/user@0.service/memory.pressure
unconfined_u:object_r:cgroup_t:s0 /sys/fs/cgroup/user.slice/user-0.slice/user@0.service/app.slice/memory.pressure
unconfined_u:object_r:cgroup_t:s0 /sys/fs/cgroup/user.slice/user-0.slice/user@0.service/app.slice/dbus.socket/memory.pressure
unconfined_u:object_r:cgroup_t:s0 /sys/fs/cgroup/user.slice/user-0.slice/user@0.service/init.scope/memory.pressure
system_u:object_r:memory_pressure_t:s0 /sys/fs/cgroup/sys-kernel-tracing.mount/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/init.scope/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/irqbalance.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/abrt-journal-core.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/mcafee.ma.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/sysroot.mount/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/nessusagent.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/systemd-udevd.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/systemd-udevd.service/udev/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/dbus-broker.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/systemd-homed.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/oddjobd.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/boot.mount/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/vgauthd.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/cockpit.socket/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/polkit.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/chronyd.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/auditd.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/memory.pressure
system_u:object_r:cgroup_t:s0 '/sys/fs/cgroup/system.slice/system-sshd\x2dkeygen.slice/memory.pressure'
system_u:object_r:cgroup_t:s0 '/sys/fs/cgroup/system.slice/system-dbus\x2d:1.3\x2dorg.fedoraproject.SetroubleshootPrivileged.slice/memory.pressure'
system_u:object_r:cgroup_t:s0 '/sys/fs/cgroup/system.slice/system-dbus\x2d:1.3\x2dorg.fedoraproject.SetroubleshootPrivileged.slice/dbus-:1.3-org.fedoraproject.SetroubleshootPrivileged@1.service/memory.pressure'
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/dev-zram0.swap/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/abrt-xorg.service/memory.pressure
system_u:object_r:cgroup_t:s0 '/sys/fs/cgroup/system.slice/system-systemd\x2dzram\x2dsetup.slice/memory.pressure'
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/system-modprobe.slice/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/libvirtd.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/ModemManager.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/atd.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/sshd.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/crond.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/NetworkManager.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/systemd-machined.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/gssproxy.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/rpc-gssd.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/rsyslog.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/abrtd.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/tmp.mount/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/firewalld.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/systemd-userdbd.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/setroubleshootd.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/vmtoolsd.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/sssd.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/cups.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/systemd-oomd.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/mcelog.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/systemd-resolved.service/memory.pressure
system_u:object_r:cgroup_t:s0 '/sys/fs/cgroup/system.slice/system-lvm2\x2dpvscan.slice/memory.pressure'
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/system-getty.slice/getty@tty1.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/system-getty.slice/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/avahi-daemon.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/systemd-logind.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/abrt-oops.service/memory.pressure
system_u:object_r:cgroup_t:s0 /sys/fs/cgroup/system.slice/var-lib-nfs-rpc_pipefs.mount/memory.pressure
system_u:object_r:memory_pressure_t:s0 /sys/fs/cgroup/machine.slice/memory.pressure
system_u:object_r:memory_pressure_t:s0 /sys/fs/cgroup/dev-hugepages.mount/memory.pressure
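
A listing this long is easier to compare across boots when reduced to
a count per context. What follows is only a minimal sketch, assuming
ls -Z prints the context as the first whitespace-separated field on the
same line as the path (as it does on a live system, before any mail
wrapping). tally_contexts is a hypothetical helper name, not something
from this thread.

```shell
# Tally SELinux contexts from `ls -Z` output read on stdin.
# Assumption: field 1 of every input line is the security context.
tally_contexts() {
    awk '{ count[$1]++ } END { for (c in count) print count[c], c }' | sort -rn
}

# Usage on a live system (needs read access to the cgroup tree):
#   sudo find /sys/fs/cgroup -name memory.pressure -exec ls -Z {} + | tally_contexts
```

The output is one line per distinct context, most frequent first, which
should make it quicker to see whether the memory_pressure_t entries
changed after a policy reload.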


end of thread, other threads:[~2023-03-23 16:56 UTC | newest]

Thread overview: 29+ messages
2023-03-20  7:23 cgroup2 labeling question Dominick Grift
2023-03-20 13:35 ` Stephen Smalley
2023-03-20 13:57   ` Dominick Grift
2023-03-20 14:12     ` Ondrej Mosnacek
2023-03-20 14:19       ` Dominick Grift
2023-03-20 14:43         ` Dominick Grift
2023-03-20 14:46         ` Ondrej Mosnacek
2023-03-20 15:16           ` Stephen Smalley
2023-03-20 15:23             ` Dominick Grift
2023-03-20 16:32               ` Stephen Smalley
2023-03-20 16:37                 ` Dominick Grift
2023-03-20 17:28                   ` Stephen Smalley
2023-03-20 17:53                     ` Stephen Smalley
2023-03-20 18:07                       ` Dominick Grift
2023-03-20 18:22                         ` Christian Göttsche
2023-03-20 20:23                           ` Stephen Smalley
2023-03-21 13:40                             ` Ondrej Mosnacek
2023-03-21 14:42                               ` Dominick Grift
2023-03-22 17:07                                 ` Matthew Sheets
2023-03-22 17:15                                   ` Dominick Grift
2023-03-22 17:27                                   ` Stephen Smalley
2023-03-23 13:55                                     ` Matthew Sheets
2023-03-23 14:42                                       ` Matthew Sheets
2023-03-23 14:53                                         ` Dominick Grift
2023-03-23 16:56                                       ` Stephen Smalley
2023-03-20 18:15                       ` Stephen Smalley
2023-03-20 18:19                         ` Dominick Grift
2023-03-20 18:22                           ` Stephen Smalley
2023-03-20 18:26                             ` Dominick Grift
