All of lore.kernel.org
 help / color / mirror / Atom feed
* Race condition during policy load in kernel
@ 2020-04-27 20:40 Daniel Burgener
  2020-04-28 18:54 ` Stephen Smalley
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Burgener @ 2020-04-27 20:40 UTC (permalink / raw)
  To: Stephen Smalley, selinux list; +Cc: james.morris

Hello all,

We've noticed a few instances of strange failures in userspace object 
managers happening immediately after a policy load, that we believe 
we've traced to a race condition in the kernel, and wanted to get your 
thoughts on our diagnosis and the appropriate fix before creating a 
patch in case we've missed something.

The issue is that userspace object managers rely on /sys/fs/selinux to 
determine the mapping of object class and permission strings to numeric 
identifiers, however immediately after a policy load, 
/sys/fs/selinux/{booleans,class,policy_capabilities}, are deleted and 
recreated.  This leaves a window where the object classes (etc) appear 
to be not defined, even if they are defined in both the old and new 
policies.

We have observed this with both dbus and systemd userspace object 
managers, and it can be reproduced straightforwardly by running the 
following (borrowed from bug linked below):

# (while true; do cat /sys/fs/selinux/class/service/index >/dev/null; 
done) &
# while true; do sudo load_policy; echo -n .; sleep 0.1;done

Periodically, we will get "No such file or directory" messages printed 
to stderr.  In the event of a userspace object manager using libselinux 
to check a userspace permission, that will result in a USER_AVC message 
indicating an unknown object class, and in the event that 
--handle-unknown is set to "deny", it will deny the access.

It seems to me as though some sort of locking should occur in the 
selinuxfs to prevent access to the files being deleted and recreated 
during the policy load, so that userspace programs relying on them (in 
particular userspace object managers doing class lookups) get a 
consistent view of the classes, perms, booleans and capabilities in the 
loaded policy.

This seems to be related to 
https://github.com/SELinuxProject/selinux-kernel/issues/42 but I believe 
it is a different case.  The deadlock in that bug seems to be related to 
the underlying filesystem functions, specifically around directory 
deletion while this is an issue the selinuxfs logic specifically. The 
above linked issue appears to have been fixed in recent upstream 
kernels, per the bug, but I have verified the issue I am discussing here 
in 5.7.0 rc3.

It seems to me as though from the perspective of userspace that all of 
sel_make_policy_nodes (or at least all of each of its component 
functions) should be atomic.  There was some discussion in a previous 
thread 
(https://lore.kernel.org/selinux/20181002155810.GP32577@ZenIV.linux.org.uk/) 
around a significant refactor of policy loading in general.  It appears 
as though the direct issue there of access during the deletion has been 
resolved (commit d4f4de5e5ef8efde85febb6876cd3c8ab1631999), although the 
complete suggested fix of refactoring policy_load into two parts has not 
been done.  Would that refactor be the right approach to the problem I 
am trying to solve?  Would a patch for adding locking around the 
selinuxfs delete/recreate operation be considered? That wouldn't address 
all the concerns, (namely the potential to access a view of the policy 
that doesn't match the currently loaded policy and error recovery in the 
case that sel_make_nodes fails), but would improve the reliability of 
existing userspace object managers

I'm happy to create and submit a patch, but I wanted to get the 
communities thought on the problem and correct approach first.

-Daniel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Race condition during policy load in kernel
  2020-04-27 20:40 Race condition during policy load in kernel Daniel Burgener
@ 2020-04-28 18:54 ` Stephen Smalley
  2020-04-28 19:34   ` Ondrej Mosnacek
  0 siblings, 1 reply; 7+ messages in thread
From: Stephen Smalley @ 2020-04-28 18:54 UTC (permalink / raw)
  To: Daniel Burgener, Paul Moore, Ondrej Mosnacek; +Cc: selinux list, james.morris

On Mon, Apr 27, 2020 at 4:40 PM Daniel Burgener
<dburgener@linux.microsoft.com> wrote:
>
> Hello all,
>
> We've noticed a few instances of strange failures in userspace object
> managers happening immediately after a policy load, that we believe
> we've traced to a race condition in the kernel, and wanted to get your
> thoughts on our diagnosis and the appropriate fix before creating a
> patch in case we've missed something.
>
> The issue is that userspace object managers rely on /sys/fs/selinux to
> determine the mapping of object class and permission strings to numeric
> identifiers, however immediately after a policy load,
> /sys/fs/selinux/{booleans,class,policy_capabilities}, are deleted and
> recreated.  This leaves a window where the object classes (etc) appear
> to be not defined, even if they are defined in both the old and new
> policies.
>
> We have observed this with both dbus and systemd userspace object
> managers, and it can be reproduced straightforwardly by running the
> following (borrowed from bug linked below):
>
> # (while true; do cat /sys/fs/selinux/class/service/index >/dev/null;
> done) &
> # while true; do sudo load_policy; echo -n .; sleep 0.1;done
>
> Periodically, we will get "No such file or directory" messages printed
> to stderr.  In the event of a userspace object manager using libselinux
> to check a userspace permission, that will result in a USER_AVC message
> indicating an unknown object class, and in the event that
> --handle-unknown is set to "deny", it will deny the access.
>
> It seems to me as though some sort of locking should occur in the
> selinuxfs to prevent access to the files being deleted and recreated
> during the policy load, so that userspace programs relying on them (in
> particular userspace object managers doing class lookups) get a
> consistent view of the classes, perms, booleans and capabilities in the
> loaded policy.
>
> This seems to be related to
> https://github.com/SELinuxProject/selinux-kernel/issues/42 but I believe
> it is a different case.  The deadlock in that bug seems to be related to
> the underlying filesystem functions, specifically around directory
> deletion while this is an issue the selinuxfs logic specifically. The
> above linked issue appears to have been fixed in recent upstream
> kernels, per the bug, but I have verified the issue I am discussing here
> in 5.7.0 rc3.
>
> It seems to me as though from the perspective of userspace that all of
> sel_make_policy_nodes (or at least all of each of its component
> functions) should be atomic.  There was some discussion in a previous
> thread
> (https://lore.kernel.org/selinux/20181002155810.GP32577@ZenIV.linux.org.uk/)
> around a significant refactor of policy loading in general.  It appears
> as though the direct issue there of access during the deletion has been
> resolved (commit d4f4de5e5ef8efde85febb6876cd3c8ab1631999), although the
> complete suggested fix of refactoring policy_load into two parts has not
> been done.  Would that refactor be the right approach to the problem I
> am trying to solve?  Would a patch for adding locking around the
> selinuxfs delete/recreate operation be considered? That wouldn't address
> all the concerns, (namely the potential to access a view of the policy
> that doesn't match the currently loaded policy and error recovery in the
> case that sel_make_nodes fails), but would improve the reliability of
> existing userspace object managers
>
> I'm happy to create and submit a patch, but I wanted to get the
> communities thought on the problem and correct approach first.

I think the best approach is the one sketched by viro but I recognize
that's a big lift.
Willing to help but not entirely clear on the interactions with the
dcache for atomically swapping
the new trees into place.

I suspect this is just being exposed now due to more recent changes in
userspace to try to fully support
changes in class/permission values at runtime.  Previously userspace
object managers tended to only
map them during initialization or on first use and then would just
keep using the cached values, whereas
now they try to map them lazily and flush their caches on a reload
notification.  Some of those changes
were in libselinux and others in the userspace object managers (e.g.
systemd, dbus-broker or dbusd).
Not sure exactly what versions of libselinux, systemd, and
dbusd/dbus-broker you are using.

selinuxfs itself does take a mutex around the entire policy load
including the delete/re-create sequence but
that only serializes with other policy operations (load, read,
booleans), not with dcache walk/lookup.

cc'ing Ondrej since he attempted to fix the earlier selinuxfs bug and
may have an opinion on the best way
forward for this issue.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Race condition during policy load in kernel
  2020-04-28 18:54 ` Stephen Smalley
@ 2020-04-28 19:34   ` Ondrej Mosnacek
  2020-04-28 20:38     ` Daniel Burgener
  0 siblings, 1 reply; 7+ messages in thread
From: Ondrej Mosnacek @ 2020-04-28 19:34 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: Daniel Burgener, Paul Moore, selinux list, james.morris

On Tue, Apr 28, 2020 at 8:54 PM Stephen Smalley
<stephen.smalley.work@gmail.com> wrote:
> On Mon, Apr 27, 2020 at 4:40 PM Daniel Burgener
> <dburgener@linux.microsoft.com> wrote:
> >
> > Hello all,
> >
> > We've noticed a few instances of strange failures in userspace object
> > managers happening immediately after a policy load, that we believe
> > we've traced to a race condition in the kernel, and wanted to get your
> > thoughts on our diagnosis and the appropriate fix before creating a
> > patch in case we've missed something.
> >
> > The issue is that userspace object managers rely on /sys/fs/selinux to
> > determine the mapping of object class and permission strings to numeric
> > identifiers, however immediately after a policy load,
> > /sys/fs/selinux/{booleans,class,policy_capabilities}, are deleted and
> > recreated.  This leaves a window where the object classes (etc) appear
> > to be not defined, even if they are defined in both the old and new
> > policies.
> >
> > We have observed this with both dbus and systemd userspace object
> > managers, and it can be reproduced straightforwardly by running the
> > following (borrowed from bug linked below):
> >
> > # (while true; do cat /sys/fs/selinux/class/service/index >/dev/null;
> > done) &
> > # while true; do sudo load_policy; echo -n .; sleep 0.1;done
> >
> > Periodically, we will get "No such file or directory" messages printed
> > to stderr.  In the event of a userspace object manager using libselinux
> > to check a userspace permission, that will result in a USER_AVC message
> > indicating an unknown object class, and in the event that
> > --handle-unknown is set to "deny", it will deny the access.
> >
> > It seems to me as though some sort of locking should occur in the
> > selinuxfs to prevent access to the files being deleted and recreated
> > during the policy load, so that userspace programs relying on them (in
> > particular userspace object managers doing class lookups) get a
> > consistent view of the classes, perms, booleans and capabilities in the
> > loaded policy.
> >
> > This seems to be related to
> > https://github.com/SELinuxProject/selinux-kernel/issues/42 but I believe
> > it is a different case.  The deadlock in that bug seems to be related to
> > the underlying filesystem functions, specifically around directory
> > deletion while this is an issue the selinuxfs logic specifically. The
> > above linked issue appears to have been fixed in recent upstream
> > kernels, per the bug, but I have verified the issue I am discussing here
> > in 5.7.0 rc3.
> >
> > It seems to me as though from the perspective of userspace that all of
> > sel_make_policy_nodes (or at least all of each of its component
> > functions) should be atomic.  There was some discussion in a previous
> > thread
> > (https://lore.kernel.org/selinux/20181002155810.GP32577@ZenIV.linux.org.uk/)
> > around a significant refactor of policy loading in general.  It appears
> > as though the direct issue there of access during the deletion has been
> > resolved (commit d4f4de5e5ef8efde85febb6876cd3c8ab1631999), although the
> > complete suggested fix of refactoring policy_load into two parts has not
> > been done.  Would that refactor be the right approach to the problem I
> > am trying to solve?  Would a patch for adding locking around the
> > selinuxfs delete/recreate operation be considered? That wouldn't address
> > all the concerns, (namely the potential to access a view of the policy
> > that doesn't match the currently loaded policy and error recovery in the
> > case that sel_make_nodes fails), but would improve the reliability of
> > existing userspace object managers
> >
> > I'm happy to create and submit a patch, but I wanted to get the
> > communities thought on the problem and correct approach first.
>
> I think the best approach is the one sketched by viro but I recognize
> that's a big lift.
> Willing to help but not entirely clear on the interactions with the
> dcache for atomically swapping
> the new trees into place.
>
> I suspect this is just being exposed now due to more recent changes in
> userspace to try to fully support
> changes in class/permission values at runtime.  Previously userspace
> object managers tended to only
> map them during initialization or on first use and then would just
> keep using the cached values, whereas
> now they try to map them lazily and flush their caches on a reload
> notification.  Some of those changes
> were in libselinux and others in the userspace object managers (e.g.
> systemd, dbus-broker or dbusd).
> Not sure exactly what versions of libselinux, systemd, and
> dbusd/dbus-broker you are using.
>
> selinuxfs itself does take a mutex around the entire policy load
> including the delete/re-create sequence but
> that only serializes with other policy operations (load, read,
> booleans), not with dcache walk/lookup.
>
> cc'ing Ondrej since he attempted to fix the earlier selinuxfs bug and
> may have an opinion on the best way
> forward for this issue.

Well, I attempted a few times and each time failed miserably :)
Thankfully, hitting the bug via another fs eventually forced viro to
fix it (in a way I could probably never come up with myself). However,
I remember trying to originally fix the bug by means of making the
swapover atomic, but later I realized that these two are actually
independent issues. After that I didn't get back to atomizing the
swapover, but IIRC I had the impression that it might not be all that
difficult... (at least if I ignore possible failures during the new
dentry tree creation for now - it would still be unsafe, but it would
be a start). But knowing VFS, I bet when I actually try it will prove
to be much more tricky ;)

I'll try to find some time to sit down to it again, but at the moment
I'm juggling a bunch of higher priority stuff so it might be a while
before I get to it.

-- 
Ondrej Mosnacek <omosnace at redhat dot com>
Software Engineer, Security Technologies
Red Hat, Inc.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Race condition during policy load in kernel
  2020-04-28 19:34   ` Ondrej Mosnacek
@ 2020-04-28 20:38     ` Daniel Burgener
  2020-04-29 12:32       ` Stephen Smalley
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Burgener @ 2020-04-28 20:38 UTC (permalink / raw)
  To: Ondrej Mosnacek, Stephen Smalley; +Cc: Paul Moore, selinux list, james.morris

On 4/28/20 3:34 PM, Ondrej Mosnacek wrote:
> On Tue, Apr 28, 2020 at 8:54 PM Stephen Smalley
> <stephen.smalley.work@gmail.com> wrote:
>> On Mon, Apr 27, 2020 at 4:40 PM Daniel Burgener
>> <dburgener@linux.microsoft.com> wrote:
>>> Hello all,
>>>
>>> We've noticed a few instances of strange failures in userspace object
>>> managers happening immediately after a policy load, that we believe
>>> we've traced to a race condition in the kernel, and wanted to get your
>>> thoughts on our diagnosis and the appropriate fix before creating a
>>> patch in case we've missed something.
>>>
>>> The issue is that userspace object managers rely on /sys/fs/selinux to
>>> determine the mapping of object class and permission strings to numeric
>>> identifiers, however immediately after a policy load,
>>> /sys/fs/selinux/{booleans,class,policy_capabilities}, are deleted and
>>> recreated.  This leaves a window where the object classes (etc) appear
>>> to be not defined, even if they are defined in both the old and new
>>> policies.
>>>
>>> We have observed this with both dbus and systemd userspace object
>>> managers, and it can be reproduced straightforwardly by running the
>>> following (borrowed from bug linked below):
>>>
>>> # (while true; do cat /sys/fs/selinux/class/service/index >/dev/null;
>>> done) &
>>> # while true; do sudo load_policy; echo -n .; sleep 0.1;done
>>>
>>> Periodically, we will get "No such file or directory" messages printed
>>> to stderr.  In the event of a userspace object manager using libselinux
>>> to check a userspace permission, that will result in a USER_AVC message
>>> indicating an unknown object class, and in the event that
>>> --handle-unknown is set to "deny", it will deny the access.
>>>
>>> It seems to me as though some sort of locking should occur in the
>>> selinuxfs to prevent access to the files being deleted and recreated
>>> during the policy load, so that userspace programs relying on them (in
>>> particular userspace object managers doing class lookups) get a
>>> consistent view of the classes, perms, booleans and capabilities in the
>>> loaded policy.
>>>
>>> This seems to be related to
>>> https://github.com/SELinuxProject/selinux-kernel/issues/42 but I believe
>>> it is a different case.  The deadlock in that bug seems to be related to
>>> the underlying filesystem functions, specifically around directory
>>> deletion while this is an issue the selinuxfs logic specifically. The
>>> above linked issue appears to have been fixed in recent upstream
>>> kernels, per the bug, but I have verified the issue I am discussing here
>>> in 5.7.0 rc3.
>>>
>>> It seems to me as though from the perspective of userspace that all of
>>> sel_make_policy_nodes (or at least all of each of its component
>>> functions) should be atomic.  There was some discussion in a previous
>>> thread
>>> (https://lore.kernel.org/selinux/20181002155810.GP32577@ZenIV.linux.org.uk/)
>>> around a significant refactor of policy loading in general.  It appears
>>> as though the direct issue there of access during the deletion has been
>>> resolved (commit d4f4de5e5ef8efde85febb6876cd3c8ab1631999), although the
>>> complete suggested fix of refactoring policy_load into two parts has not
>>> been done.  Would that refactor be the right approach to the problem I
>>> am trying to solve?  Would a patch for adding locking around the
>>> selinuxfs delete/recreate operation be considered? That wouldn't address
>>> all the concerns, (namely the potential to access a view of the policy
>>> that doesn't match the currently loaded policy and error recovery in the
>>> case that sel_make_nodes fails), but would improve the reliability of
>>> existing userspace object managers
>>>
>>> I'm happy to create and submit a patch, but I wanted to get the
>>> communities thought on the problem and correct approach first.
>> I think the best approach is the one sketched by viro but I recognize
>> that's a big lift.
>> Willing to help but not entirely clear on the interactions with the
>> dcache for atomically swapping
>> the new trees into place.
>>
>> I suspect this is just being exposed now due to more recent changes in
>> userspace to try to fully support
>> changes in class/permission values at runtime.  Previously userspace
>> object managers tended to only
>> map them during initialization or on first use and then would just
>> keep using the cached values, whereas
>> now they try to map them lazily and flush their caches on a reload
>> notification.  Some of those changes
>> were in libselinux and others in the userspace object managers (e.g.
>> systemd, dbus-broker or dbusd).
>> Not sure exactly what versions of libselinux, systemd, and
>> dbusd/dbus-broker you are using.
>>
>> selinuxfs itself does take a mutex around the entire policy load
>> including the delete/re-create sequence but
>> that only serializes with other policy operations (load, read,
>> booleans), not with dcache walk/lookup.
>>
>> cc'ing Ondrej since he attempted to fix the earlier selinuxfs bug and
>> may have an opinion on the best way
>> forward for this issue.
> Well, I attempted a few times and each time failed miserably :)
> Thankfully, hitting the bug via another fs eventually forced viro to
> fix it (in a way I could probably never come up with myself). However,
> I remember trying to originally fix the bug by means of making the
> swapover atomic, but later I realized that these two are actually
> independent issues. After that I didn't get back to atomizing the
> swapover, but IIRC I had the impression that it might not be all that
> difficult... (at least if I ignore possible failures during the new
> dentry tree creation for now - it would still be unsafe, but it would
> be a start). But knowing VFS, I bet when I actually try it will prove
> to be much more tricky ;)
>
> I'll try to find some time to sit down to it again, but at the moment
> I'm juggling a bunch of higher priority stuff so it might be a while
> before I get to it.
>
The weird thing about atomizing the swapover in isolation is that it's 
unclear what action to take on failure.  The existing code looks like it 
would leave a broken version of the selinuxfs laying around.  Completing 
the swapover even on failure would give us an atomic version of the 
current situation, but feels extra weird. If ignoring the failure case 
as Ondrej suggests would be acceptable in a patch, that sounds like the 
quickest way to addressing at least part of the problem (the refactor of 
security_load_policy by itself doesn't get us much by itself, since we 
can't roll back the selinuxfs after we've started in the current state.)

I'd be happy to start looking at either half, although I'd prefer to 
start with atomizing the tree swapover to solve my immediate problem if 
that would have a chance at getting merged by itself given the issues 
around the failure case.

-Daniel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Race condition during policy load in kernel
  2020-04-28 20:38     ` Daniel Burgener
@ 2020-04-29 12:32       ` Stephen Smalley
  2020-04-29 12:40         ` Daniel Burgener
  0 siblings, 1 reply; 7+ messages in thread
From: Stephen Smalley @ 2020-04-29 12:32 UTC (permalink / raw)
  To: Daniel Burgener; +Cc: Ondrej Mosnacek, Paul Moore, selinux list, james.morris

On Tue, Apr 28, 2020 at 4:38 PM Daniel Burgener
<dburgener@linux.microsoft.com> wrote:
>
> On 4/28/20 3:34 PM, Ondrej Mosnacek wrote:
> > On Tue, Apr 28, 2020 at 8:54 PM Stephen Smalley
> > <stephen.smalley.work@gmail.com> wrote:
> >> On Mon, Apr 27, 2020 at 4:40 PM Daniel Burgener
> >> <dburgener@linux.microsoft.com> wrote:
> >>> Hello all,
> >>>
> >>> We've noticed a few instances of strange failures in userspace object
> >>> managers happening immediately after a policy load, that we believe
> >>> we've traced to a race condition in the kernel, and wanted to get your
> >>> thoughts on our diagnosis and the appropriate fix before creating a
> >>> patch in case we've missed something.
> >>>
> >>> The issue is that userspace object managers rely on /sys/fs/selinux to
> >>> determine the mapping of object class and permission strings to numeric
> >>> identifiers, however immediately after a policy load,
> >>> /sys/fs/selinux/{booleans,class,policy_capabilities}, are deleted and
> >>> recreated.  This leaves a window where the object classes (etc) appear
> >>> to be not defined, even if they are defined in both the old and new
> >>> policies.
> >>>
> >>> We have observed this with both dbus and systemd userspace object
> >>> managers, and it can be reproduced straightforwardly by running the
> >>> following (borrowed from bug linked below):
> >>>
> >>> # (while true; do cat /sys/fs/selinux/class/service/index >/dev/null;
> >>> done) &
> >>> # while true; do sudo load_policy; echo -n .; sleep 0.1;done
> >>>
> >>> Periodically, we will get "No such file or directory" messages printed
> >>> to stderr.  In the event of a userspace object manager using libselinux
> >>> to check a userspace permission, that will result in a USER_AVC message
> >>> indicating an unknown object class, and in the event that
> >>> --handle-unknown is set to "deny", it will deny the access.
> >>>
> >>> It seems to me as though some sort of locking should occur in the
> >>> selinuxfs to prevent access to the files being deleted and recreated
> >>> during the policy load, so that userspace programs relying on them (in
> >>> particular userspace object managers doing class lookups) get a
> >>> consistent view of the classes, perms, booleans and capabilities in the
> >>> loaded policy.
> >>>
> >>> This seems to be related to
> >>> https://github.com/SELinuxProject/selinux-kernel/issues/42 but I believe
> >>> it is a different case.  The deadlock in that bug seems to be related to
> >>> the underlying filesystem functions, specifically around directory
> >>> deletion while this is an issue the selinuxfs logic specifically. The
> >>> above linked issue appears to have been fixed in recent upstream
> >>> kernels, per the bug, but I have verified the issue I am discussing here
> >>> in 5.7.0 rc3.
> >>>
> >>> It seems to me as though from the perspective of userspace that all of
> >>> sel_make_policy_nodes (or at least all of each of its component
> >>> functions) should be atomic.  There was some discussion in a previous
> >>> thread
> >>> (https://lore.kernel.org/selinux/20181002155810.GP32577@ZenIV.linux.org.uk/)
> >>> around a significant refactor of policy loading in general.  It appears
> >>> as though the direct issue there of access during the deletion has been
> >>> resolved (commit d4f4de5e5ef8efde85febb6876cd3c8ab1631999), although the
> >>> complete suggested fix of refactoring policy_load into two parts has not
> >>> been done.  Would that refactor be the right approach to the problem I
> >>> am trying to solve?  Would a patch for adding locking around the
> >>> selinuxfs delete/recreate operation be considered? That wouldn't address
> >>> all the concerns, (namely the potential to access a view of the policy
> >>> that doesn't match the currently loaded policy and error recovery in the
> >>> case that sel_make_nodes fails), but would improve the reliability of
> >>> existing userspace object managers
> >>>
> >>> I'm happy to create and submit a patch, but I wanted to get the
> >>> communities thought on the problem and correct approach first.
> >> I think the best approach is the one sketched by viro but I recognize
> >> that's a big lift.
> >> Willing to help but not entirely clear on the interactions with the
> >> dcache for atomically swapping
> >> the new trees into place.
> >>
> >> I suspect this is just being exposed now due to more recent changes in
> >> userspace to try to fully support
> >> changes in class/permission values at runtime.  Previously userspace
> >> object managers tended to only
> >> map them during initialization or on first use and then would just
> >> keep using the cached values, whereas
> >> now they try to map them lazily and flush their caches on a reload
> >> notification.  Some of those changes
> >> were in libselinux and others in the userspace object managers (e.g.
> >> systemd, dbus-broker or dbusd).
> >> Not sure exactly what versions of libselinux, systemd, and
> >> dbusd/dbus-broker you are using.
> >>
> >> selinuxfs itself does take a mutex around the entire policy load
> >> including the delete/re-create sequence but
> >> that only serializes with other policy operations (load, read,
> >> booleans), not with dcache walk/lookup.
> >>
> >> cc'ing Ondrej since he attempted to fix the earlier selinuxfs bug and
> >> may have an opinion on the best way
> >> forward for this issue.
> > Well, I attempted a few times and each time failed miserably :)
> > Thankfully, hitting the bug via another fs eventually forced viro to
> > fix it (in a way I could probably never come up with myself). However,
> > I remember trying to originally fix the bug by means of making the
> > swapover atomic, but later I realized that these two are actually
> > independent issues. After that I didn't get back to atomizing the
> > swapover, but IIRC I had the impression that it might not be all that
> > difficult... (at least if I ignore possible failures during the new
> > dentry tree creation for now - it would still be unsafe, but it would
> > be a start). But knowing VFS, I bet when I actually try it will prove
> > to be much more tricky ;)
> >
> > I'll try to find some time to sit down to it again, but at the moment
> > I'm juggling a bunch of higher priority stuff so it might be a while
> > before I get to it.
> >
> The weird thing about atomizing the swapover in isolation is that it's
> unclear what action to take on failure.  The existing code looks like it
> would leave a broken version of the selinuxfs laying around.  Completing
> the swapover even on failure would give us an atomic version of the
> current situation, but feels extra weird. If ignoring the failure case
> as Ondrej suggests would be acceptable in a patch, that sounds like the
> quickest way to addressing at least part of the problem (the refactor of
> security_load_policy by itself doesn't get us much by itself, since we
> can't roll back the selinuxfs after we've started in the current state.)

I don't entirely understand what you mean by the above, since viro's sketch
of a new policy load sequence does permit handling the error case in the middle
of the new selinuxfs directory/file creation by just throwing away the
new entries
altogether and leaving the old policydb/sidtab + old selinuxfs in
place.  That said
I'm not saying you have to do the whole thing at once.

> I'd be happy to start looking at either half, although I'd prefer to
> start with atomizing the tree swapover to solve my immediate problem if
> that would have a chance at getting merged by itself given the issues
> around the failure case.

Certainly willing to look at incremental improvements here.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Race condition during policy load in kernel
  2020-04-29 12:32       ` Stephen Smalley
@ 2020-04-29 12:40         ` Daniel Burgener
  2020-06-10 13:40           ` Stephen Smalley
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Burgener @ 2020-04-29 12:40 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: Ondrej Mosnacek, Paul Moore, selinux list, james.morris

On 4/29/20 8:32 AM, Stephen Smalley wrote:
> On Tue, Apr 28, 2020 at 4:38 PM Daniel Burgener
> <dburgener@linux.microsoft.com> wrote:
>> On 4/28/20 3:34 PM, Ondrej Mosnacek wrote:
>>> On Tue, Apr 28, 2020 at 8:54 PM Stephen Smalley
>>> <stephen.smalley.work@gmail.com> wrote:
>>>> On Mon, Apr 27, 2020 at 4:40 PM Daniel Burgener
>>>> <dburgener@linux.microsoft.com> wrote:
>>>>> Hello all,
>>>>>
>>>>> We've noticed a few instances of strange failures in userspace object
>>>>> managers happening immediately after a policy load, that we believe
>>>>> we've traced to a race condition in the kernel, and wanted to get your
>>>>> thoughts on our diagnosis and the appropriate fix before creating a
>>>>> patch in case we've missed something.
>>>>>
>>>>> The issue is that userspace object managers rely on /sys/fs/selinux to
>>>>> determine the mapping of object class and permission strings to numeric
>>>>> identifiers, however immediately after a policy load,
>>>>> /sys/fs/selinux/{booleans,class,policy_capabilities}, are deleted and
>>>>> recreated.  This leaves a window where the object classes (etc) appear
>>>>> to be not defined, even if they are defined in both the old and new
>>>>> policies.
>>>>>
>>>>> We have observed this with both dbus and systemd userspace object
>>>>> managers, and it can be reproduced straightforwardly by running the
>>>>> following (borrowed from bug linked below):
>>>>>
>>>>> # (while true; do cat /sys/fs/selinux/class/service/index >/dev/null;
>>>>> done) &
>>>>> # while true; do sudo load_policy; echo -n .; sleep 0.1;done
>>>>>
>>>>> Periodically, we will get "No such file or directory" messages printed
>>>>> to stderr.  In the event of a userspace object manager using libselinux
>>>>> to check a userspace permission, that will result in a USER_AVC message
>>>>> indicating an unknown object class, and in the event that
>>>>> --handle-unknown is set to "deny", it will deny the access.
>>>>>
>>>>> It seems to me as though some sort of locking should occur in the
>>>>> selinuxfs to prevent access to the files being deleted and recreated
>>>>> during the policy load, so that userspace programs relying on them (in
>>>>> particular userspace object managers doing class lookups) get a
>>>>> consistent view of the classes, perms, booleans and capabilities in the
>>>>> loaded policy.
>>>>>
>>>>> This seems to be related to
>>>>> https://github.com/SELinuxProject/selinux-kernel/issues/42 but I believe
>>>>> it is a different case.  The deadlock in that bug seems to be related to
>>>>> the underlying filesystem functions, specifically around directory
>>>>> deletion while this is an issue the selinuxfs logic specifically. The
>>>>> above linked issue appears to have been fixed in recent upstream
>>>>> kernels, per the bug, but I have verified the issue I am discussing here
>>>>> in 5.7.0 rc3.
>>>>>
>>>>> It seems to me as though from the perspective of userspace that all of
>>>>> sel_make_policy_nodes (or at least all of each of its component
>>>>> functions) should be atomic.  There was some discussion in a previous
>>>>> thread
>>>>> (https://lore.kernel.org/selinux/20181002155810.GP32577@ZenIV.linux.org.uk/)
>>>>> around a significant refactor of policy loading in general.  It appears
>>>>> as though the direct issue there of access during the deletion has been
>>>>> resolved (commit d4f4de5e5ef8efde85febb6876cd3c8ab1631999), although the
>>>>> complete suggested fix of refactoring policy_load into two parts has not
>>>>> been done.  Would that refactor be the right approach to the problem I
>>>>> am trying to solve?  Would a patch for adding locking around the
>>>>> selinuxfs delete/recreate operation be considered? That wouldn't address
>>>>> all the concerns, (namely the potential to access a view of the policy
>>>>> that doesn't match the currently loaded policy and error recovery in the
>>>>> case that sel_make_nodes fails), but would improve the reliability of
>>>>> existing userspace object managers
>>>>>
>>>>> I'm happy to create and submit a patch, but I wanted to get the
>>>>> communities thought on the problem and correct approach first.
>>>> I think the best approach is the one sketched by viro but I recognize
>>>> that's a big lift.
>>>> Willing to help but not entirely clear on the interactions with the
>>>> dcache for atomically swapping
>>>> the new trees into place.
>>>>
>>>> I suspect this is just being exposed now due to more recent changes in
>>>> userspace to try to fully support
>>>> changes in class/permission values at runtime.  Previously userspace
>>>> object managers tended to only
>>>> map them during initialization or on first use and then would just
>>>> keep using the cached values, whereas
>>>> now they try to map them lazily and flush their caches on a reload
>>>> notification.  Some of those changes
>>>> were in libselinux and others in the userspace object managers (e.g.
>>>> systemd, dbus-broker or dbusd).
>>>> Not sure exactly what versions of libselinux, systemd, and
>>>> dbusd/dbus-broker you are using.
>>>>
>>>> selinuxfs itself does take a mutex around the entire policy load
>>>> including the delete/re-create sequence but
>>>> that only serializes with other policy operations (load, read,
>>>> booleans), not with dcache walk/lookup.
>>>>
>>>> cc'ing Ondrej since he attempted to fix the earlier selinuxfs bug and
>>>> may have an opinion on the best way
>>>> forward for this issue.
>>> Well, I attempted a few times and each time failed miserably :)
>>> Thankfully, hitting the bug via another fs eventually forced viro to
>>> fix it (in a way I could probably never come up with myself). However,
>>> I remember trying to originally fix the bug by means of making the
>>> swapover atomic, but later I realized that these two are actually
>>> independent issues. After that I didn't get back to atomizing the
>>> swapover, but IIRC I had the impression that it might not be all that
>>> difficult... (at least if I ignore possible failures during the new
>>> dentry tree creation for now - it would still be unsafe, but it would
>>> be a start). But knowing VFS, I bet when I actually try it will prove
>>> to be much more tricky ;)
>>>
>>> I'll try to find some time to sit down to it again, but at the moment
>>> I'm juggling a bunch of higher priority stuff so it might be a while
>>> before I get to it.
>>>
>> The weird thing about atomizing the swapover in isolation is that it's
>> unclear what action to take on failure.  The existing code looks like it
>> would leave a broken version of the selinuxfs laying around.  Completing
>> the swapover even on failure would give us an atomic version of the
>> current situation, but feels extra weird. If ignoring the failure case
>> as Ondrej suggests would be acceptable in a patch, that sounds like the
>> quickest way to addressing at least part of the problem (the refactor of
>> security_load_policy by itself doesn't get us much by itself, since we
>> can't roll back the selinuxfs after we've started in the current state.)
> I don't entirely understand what you mean by the above, since viro's sketch
> of a new policy load sequence does permit handling the error case in the middle
> of the new selinuxfs directory/file creation by just throwing away the
> new entries
> altogether and leaving the old policydb/sidtab + old selinuxfs in
> place.  That said
> I'm not saying you have to do the whole thing at once.

My point on that was just that without building the new selinuxfs 
entries offline, we don't have the old selinuxfs anymore to restore to, 
since it's already been deleted.  I suppose we could rebuild it at that 
point though.  Viro's approach works great once we have everything, but 
just refactoring policy load into two parts as a first step without 
modifying the selinuxfs by itself I don't think gets us very much if 
anything.

>> I'd be happy to start looking at either half, although I'd prefer to
>> start with atomizing the tree swapover to solve my immediate problem if
>> that would have a chance at getting merged by itself given the issues
>> around the failure case.
> Certainly willing to look at incremental improvements here.

I'll start looking into the atomic tree swapover first then.  If that 
goes well, I'll look into moving on to the policy load refactor.

-Daniel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Race condition during policy load in kernel
  2020-04-29 12:40         ` Daniel Burgener
@ 2020-06-10 13:40           ` Stephen Smalley
  0 siblings, 0 replies; 7+ messages in thread
From: Stephen Smalley @ 2020-06-10 13:40 UTC (permalink / raw)
  To: Daniel Burgener; +Cc: Ondrej Mosnacek, Paul Moore, selinux list, james.morris

On Wed, Apr 29, 2020 at 8:40 AM Daniel Burgener
<dburgener@linux.microsoft.com> wrote:
> I'll start looking into the atomic tree swapover first then.  If that
> goes well, I'll look into moving on to the policy load refactor.

Wanted to check in and see whether you've been able to make any
progress or need any help.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-06-10 13:40 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-27 20:40 Race condition during policy load in kernel Daniel Burgener
2020-04-28 18:54 ` Stephen Smalley
2020-04-28 19:34   ` Ondrej Mosnacek
2020-04-28 20:38     ` Daniel Burgener
2020-04-29 12:32       ` Stephen Smalley
2020-04-29 12:40         ` Daniel Burgener
2020-06-10 13:40           ` Stephen Smalley

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.