From: Paul Moore <paul@paul-moore.com> To: Daniel Borkmann <daniel@iogearbox.net> Cc: Ondrej Mosnacek <omosnace@redhat.com>, linux-security-module@vger.kernel.org, James Morris <jmorris@namei.org>, Steven Rostedt <rostedt@goodmis.org>, Ingo Molnar <mingo@redhat.com>, Stephen Smalley <stephen.smalley.work@gmail.com>, selinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-fsdevel@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Casey Schaufler <casey@schaufler-ca.com>, jolsa@redhat.com, ast@kernel.org, andrii@kernel.org, davem@davemloft.net, kuba@kernel.org, Linus Torvalds <torvalds@linux-foundation.org> Subject: Re: [PATCH v2] lockdown,selinux: avoid bogus SELinux lockdown permission checks Date: Fri, 4 Jun 2021 00:50:47 -0400 [thread overview] Message-ID: <CAHC9VhS=BeGdaAi8Ae5Fx42Fzy_ybkcXwMNcPwK=uuA6=+SRcg@mail.gmail.com> (raw) In-Reply-To: <f4373013-88fb-b839-aaaa-3826548ebd0c@iogearbox.net> On Thu, Jun 3, 2021 at 2:53 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > On 6/2/21 5:13 PM, Paul Moore wrote: > [...] > > Help me out here, is your answer that the access check can only be > > done at BPF program load time? That isn't really a solution from a > > SELinux perspective as far as I'm concerned. > > That is the current answer. The unfortunate irony is that 59438b46471a > ("security,lockdown,selinux: implement SELinux lockdown") broke this in > the first place. W/o the SELinux hook implementation it would have been > working just fine at runtime, but given it's UAPI since quite a while > now, that ship has sailed. Explaining the other side of the "unfortunate irony ..." comment is going to take us in a direction that isn't very constructive so I'm going to skip past that now and simply say that if there was better cooperation across subsystems, especially with the LSM folks, a lot of this pain could be mitigated. ... and yes I said "mitigated", I'm not foolish to think the pain could be avoided entirely ;) > > I understand the ideas I've tossed out aren't practical from a BPF > > perspective, but it would be nice if we could find something that does > > work. Surely you BPF folks can think of some way to provide a > > runtime, not load time, check? > > I did run this entire discussion by both of the other BPF co-maintainers > (Alexei, Andrii, CC'ed) and together we did further brainstorming on the > matter on how we could solve this, but couldn't find a sensible & clean > solution so far. Before I jump into the patch below I just want to say that I appreciate you looking into solutions on the BPF side of things. However, I voted "no" on this patch previously and since you haven't really changed it, my "no"/NACK vote remains, at least until we exhaust a few more options. > [PATCH] bpf, lockdown, audit: Fix buggy SELinux lockdown permission checks > > Commit 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown") > added an implementation of the locked_down LSM hook to SELinux, with the aim > to restrict which domains are allowed to perform operations that would breach > lockdown. This is indirectly also getting audit subsystem involved to report > events. The latter is problematic, as reported by Ondrej and Serhei, since it > can bring down the whole system via audit: > > 1) The audit events that are triggered due to calls to security_locked_down() > can OOM kill a machine, see below details [0]. > > 2) It also seems to be causing a deadlock via avc_has_perm()/slow_avc_audit() > when trying to wake up kauditd, for example, when using trace_sched_switch() > tracepoint, see details in [1]. Triggering this was not via some hypothetical > corner case, but with existing tools like runqlat & runqslower from bcc, for > example, which make use of this tracepoint. Rough call sequence goes like: > > rq_lock(rq) -> -------------------------+ > trace_sched_switch() -> | > bpf_prog_xyz() -> +-> deadlock > selinux_lockdown() -> | > audit_log_end() -> | > wake_up_interruptible() -> | > try_to_wake_up() -> | > rq_lock(rq) --------------+ Since BPF is a bit of chaotic nightmare in the sense that it basically out-of-tree kernel code that can be called from anywhere and do pretty much anything; it presents quite the challenge for those of us worried about LSM access controls. You and the other BPF folks have investigated ways in which BPF might be able to disable helper functions allowing us to do proper runtime access checks but haven't been able to make it work, which brings this patch up yet again. I'm not a fan of this patch as it basically allows BPF programs to side-step any changes to the security policy once the BPF programs have been loaded; this is Not Good. So let's look at this from a different angle. Let's look at the two problems you mention above. If we start with the runqueue deadlock we see the main problem is that audit_log_end() pokes the kauditd_wait waitqueue to ensure the kauditd_thread thread wakes up and processes the audit queue. The audit_log_start() function does something similar, but it is conditional on a number of factors and isn't as likely to be hit. If we relocate these kauditd wakeup calls we can remove the deadlock in trace_sched_switch(). In the case of CONFIG_AUDITSYSCALL=y we can probably just move the wakeup to __audit_syscall_exit() and in the case of CONFIG_AUDITSYSCALL=n we can likely just change the wait_event_freezable() call in kauditd_thread to a wait_event_freezable_timeout() call with a HZ timeout (the audit stream will be much less on these systems anyway so a queue overflow is much less likely). I'm building a kernel with these changes now, I should have something to test when I wake up tomorrow morning. It might even provide a bit of a performance boost as we would only be calling a wakeup function once for each syscall. The other issue is related to security_locked_down() and using the right subject for the access control check. As has been pointed out several times in this thread, the current code uses the current() task as the subject, which is arguably incorrect for many of the BPF helper functions. In the case of BPF, we have talked about using the credentials of the task which loaded the BPF program instead of current(), and that does make a certain amount of sense. Such an approach should make the security policy easier to develop and rationalize, leading to a significant decrease in audit records coming from LSM access denials. The question is how to implement such a change. The current SELinux security_bpf_prog_alloc() hook causes the newly loaded BPF program to inherit the subject context from the task which loads the BPF program; if it is possible to reference the bpf_prog struct, or really just the associated bpf_prog_aux->security blob, from inside a security_bpf_locked_down() function we use that subject information to perform the access check. BPF folks, is there a way to get that information from within a BPF kernel helper function? If it isn't currently possible, could it be made possible (or something similar)? If it turns out we can do both of these things (relocated audit wakeup, bpf_prog reference inside kernel helpers) I think we can arrive at a fix which both groups can accept. -- paul moore www.paul-moore.com
WARNING: multiple messages have this Message-ID (diff)
From: Paul Moore <paul@paul-moore.com> To: Daniel Borkmann <daniel@iogearbox.net> Cc: jolsa@redhat.com, ast@kernel.org, selinux@vger.kernel.org, netdev@vger.kernel.org, Stephen Smalley <stephen.smalley.work@gmail.com>, Linus Torvalds <torvalds@linux-foundation.org>, andrii@kernel.org, Ondrej Mosnacek <omosnace@redhat.com>, Steven Rostedt <rostedt@goodmis.org>, James Morris <jmorris@namei.org>, Casey Schaufler <casey@schaufler-ca.com>, linux-security-module@vger.kernel.org, Ingo Molnar <mingo@redhat.com>, linux-fsdevel@vger.kernel.org, kuba@kernel.org, bpf@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, davem@davemloft.net, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2] lockdown,selinux: avoid bogus SELinux lockdown permission checks Date: Fri, 4 Jun 2021 00:50:47 -0400 [thread overview] Message-ID: <CAHC9VhS=BeGdaAi8Ae5Fx42Fzy_ybkcXwMNcPwK=uuA6=+SRcg@mail.gmail.com> (raw) In-Reply-To: <f4373013-88fb-b839-aaaa-3826548ebd0c@iogearbox.net> On Thu, Jun 3, 2021 at 2:53 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > On 6/2/21 5:13 PM, Paul Moore wrote: > [...] > > Help me out here, is your answer that the access check can only be > > done at BPF program load time? That isn't really a solution from a > > SELinux perspective as far as I'm concerned. > > That is the current answer. The unfortunate irony is that 59438b46471a > ("security,lockdown,selinux: implement SELinux lockdown") broke this in > the first place. W/o the SELinux hook implementation it would have been > working just fine at runtime, but given it's UAPI since quite a while > now, that ship has sailed. Explaining the other side of the "unfortunate irony ..." comment is going to take us in a direction that isn't very constructive so I'm going to skip past that now and simply say that if there was better cooperation across subsystems, especially with the LSM folks, a lot of this pain could be mitigated. ... and yes I said "mitigated", I'm not foolish to think the pain could be avoided entirely ;) > > I understand the ideas I've tossed out aren't practical from a BPF > > perspective, but it would be nice if we could find something that does > > work. Surely you BPF folks can think of some way to provide a > > runtime, not load time, check? > > I did run this entire discussion by both of the other BPF co-maintainers > (Alexei, Andrii, CC'ed) and together we did further brainstorming on the > matter on how we could solve this, but couldn't find a sensible & clean > solution so far. Before I jump into the patch below I just want to say that I appreciate you looking into solutions on the BPF side of things. However, I voted "no" on this patch previously and since you haven't really changed it, my "no"/NACK vote remains, at least until we exhaust a few more options. > [PATCH] bpf, lockdown, audit: Fix buggy SELinux lockdown permission checks > > Commit 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown") > added an implementation of the locked_down LSM hook to SELinux, with the aim > to restrict which domains are allowed to perform operations that would breach > lockdown. This is indirectly also getting audit subsystem involved to report > events. The latter is problematic, as reported by Ondrej and Serhei, since it > can bring down the whole system via audit: > > 1) The audit events that are triggered due to calls to security_locked_down() > can OOM kill a machine, see below details [0]. > > 2) It also seems to be causing a deadlock via avc_has_perm()/slow_avc_audit() > when trying to wake up kauditd, for example, when using trace_sched_switch() > tracepoint, see details in [1]. Triggering this was not via some hypothetical > corner case, but with existing tools like runqlat & runqslower from bcc, for > example, which make use of this tracepoint. Rough call sequence goes like: > > rq_lock(rq) -> -------------------------+ > trace_sched_switch() -> | > bpf_prog_xyz() -> +-> deadlock > selinux_lockdown() -> | > audit_log_end() -> | > wake_up_interruptible() -> | > try_to_wake_up() -> | > rq_lock(rq) --------------+ Since BPF is a bit of chaotic nightmare in the sense that it basically out-of-tree kernel code that can be called from anywhere and do pretty much anything; it presents quite the challenge for those of us worried about LSM access controls. You and the other BPF folks have investigated ways in which BPF might be able to disable helper functions allowing us to do proper runtime access checks but haven't been able to make it work, which brings this patch up yet again. I'm not a fan of this patch as it basically allows BPF programs to side-step any changes to the security policy once the BPF programs have been loaded; this is Not Good. So let's look at this from a different angle. Let's look at the two problems you mention above. If we start with the runqueue deadlock we see the main problem is that audit_log_end() pokes the kauditd_wait waitqueue to ensure the kauditd_thread thread wakes up and processes the audit queue. The audit_log_start() function does something similar, but it is conditional on a number of factors and isn't as likely to be hit. If we relocate these kauditd wakeup calls we can remove the deadlock in trace_sched_switch(). In the case of CONFIG_AUDITSYSCALL=y we can probably just move the wakeup to __audit_syscall_exit() and in the case of CONFIG_AUDITSYSCALL=n we can likely just change the wait_event_freezable() call in kauditd_thread to a wait_event_freezable_timeout() call with a HZ timeout (the audit stream will be much less on these systems anyway so a queue overflow is much less likely). I'm building a kernel with these changes now, I should have something to test when I wake up tomorrow morning. It might even provide a bit of a performance boost as we would only be calling a wakeup function once for each syscall. The other issue is related to security_locked_down() and using the right subject for the access control check. As has been pointed out several times in this thread, the current code uses the current() task as the subject, which is arguably incorrect for many of the BPF helper functions. In the case of BPF, we have talked about using the credentials of the task which loaded the BPF program instead of current(), and that does make a certain amount of sense. Such an approach should make the security policy easier to develop and rationalize, leading to a significant decrease in audit records coming from LSM access denials. The question is how to implement such a change. The current SELinux security_bpf_prog_alloc() hook causes the newly loaded BPF program to inherit the subject context from the task which loads the BPF program; if it is possible to reference the bpf_prog struct, or really just the associated bpf_prog_aux->security blob, from inside a security_bpf_locked_down() function we use that subject information to perform the access check. BPF folks, is there a way to get that information from within a BPF kernel helper function? If it isn't currently possible, could it be made possible (or something similar)? If it turns out we can do both of these things (relocated audit wakeup, bpf_prog reference inside kernel helpers) I think we can arrive at a fix which both groups can accept. -- paul moore www.paul-moore.com
next prev parent reply other threads:[~2021-06-04 4:52 UTC|newest] Thread overview: 74+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-05-17 9:20 [PATCH v2] lockdown,selinux: avoid bogus SELinux lockdown permission checks Ondrej Mosnacek 2021-05-17 9:20 ` [PATCH v2] lockdown, selinux: " Ondrej Mosnacek 2021-05-17 11:00 ` [PATCH v2] lockdown,selinux: " Michael Ellerman 2021-05-17 11:00 ` Michael Ellerman 2021-05-26 11:44 ` Ondrej Mosnacek 2021-05-26 11:44 ` Ondrej Mosnacek 2021-05-27 4:28 ` James Morris 2021-05-27 4:28 ` James Morris 2021-05-27 14:18 ` Paul Moore 2021-05-27 14:18 ` Paul Moore 2021-05-28 1:37 ` Paul Moore 2021-05-28 1:37 ` Paul Moore 2021-05-28 7:09 ` Daniel Borkmann 2021-05-28 7:09 ` Daniel Borkmann 2021-05-28 9:53 ` Jiri Olsa 2021-05-28 9:53 ` Jiri Olsa 2021-05-28 9:56 ` Daniel Borkmann 2021-05-28 9:56 ` Daniel Borkmann 2021-05-28 10:16 ` Jiri Olsa 2021-05-28 10:16 ` Jiri Olsa 2021-05-28 11:47 ` Jiri Olsa 2021-05-28 11:47 ` Jiri Olsa 2021-05-28 11:54 ` Daniel Borkmann 2021-05-28 11:54 ` Daniel Borkmann 2021-05-28 13:42 ` Ondrej Mosnacek 2021-05-28 13:42 ` Ondrej Mosnacek 2021-05-28 14:20 ` Daniel Borkmann 2021-05-28 14:20 ` Daniel Borkmann 2021-05-28 15:54 ` Paul Moore 2021-05-28 15:54 ` Paul Moore 2021-05-28 15:47 ` Paul Moore 2021-05-28 15:47 ` Paul Moore 2021-05-28 18:10 ` Daniel Borkmann 2021-05-28 18:10 ` Daniel Borkmann 2021-05-28 22:52 ` Paul Moore 2021-05-28 22:52 ` Paul Moore 2021-05-29 18:48 ` Paul Moore 2021-05-29 18:48 ` Paul Moore 2021-05-31 8:24 ` Daniel Borkmann 2021-05-31 8:24 ` Daniel Borkmann 2021-06-01 20:47 ` Paul Moore 2021-06-01 20:47 ` Paul Moore 2021-06-02 12:40 ` Daniel Borkmann 2021-06-02 12:40 ` Daniel Borkmann 2021-06-02 15:13 ` Paul Moore 2021-06-02 15:13 ` Paul Moore 2021-06-03 18:52 ` Daniel Borkmann 2021-06-03 18:52 ` Daniel Borkmann 2021-06-04 4:50 ` Paul Moore [this message] 2021-06-04 4:50 ` Paul Moore 2021-06-04 18:02 ` Daniel Borkmann 2021-06-04 18:02 ` Daniel Borkmann 2021-06-04 23:34 ` Paul Moore 2021-06-04 23:34 ` Paul Moore 2021-06-05 0:08 ` Alexei Starovoitov 2021-06-05 0:08 ` Alexei Starovoitov 2021-06-05 18:10 ` Casey Schaufler 2021-06-05 18:10 ` Casey Schaufler 2021-06-05 18:17 ` Linus Torvalds 2021-06-05 18:17 ` Linus Torvalds 2021-06-06 2:11 ` Paul Moore 2021-06-06 2:11 ` Paul Moore 2021-06-06 1:30 ` Paul Moore 2021-06-06 1:30 ` Paul Moore 2021-06-02 13:39 ` Ondrej Mosnacek 2021-06-02 13:39 ` Ondrej Mosnacek 2021-06-03 17:46 ` Paul Moore 2021-06-03 17:46 ` Paul Moore 2021-06-08 11:01 ` Ondrej Mosnacek 2021-06-08 11:01 ` Ondrej Mosnacek 2021-06-09 2:40 ` Paul Moore 2021-06-09 2:40 ` Paul Moore 2021-05-28 13:58 ` Steven Rostedt 2021-05-28 13:58 ` Steven Rostedt
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to='CAHC9VhS=BeGdaAi8Ae5Fx42Fzy_ybkcXwMNcPwK=uuA6=+SRcg@mail.gmail.com' \ --to=paul@paul-moore.com \ --cc=andrii@kernel.org \ --cc=ast@kernel.org \ --cc=bpf@vger.kernel.org \ --cc=casey@schaufler-ca.com \ --cc=daniel@iogearbox.net \ --cc=davem@davemloft.net \ --cc=jmorris@namei.org \ --cc=jolsa@redhat.com \ --cc=kuba@kernel.org \ --cc=linux-fsdevel@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-security-module@vger.kernel.org \ --cc=linuxppc-dev@lists.ozlabs.org \ --cc=mingo@redhat.com \ --cc=netdev@vger.kernel.org \ --cc=omosnace@redhat.com \ --cc=rostedt@goodmis.org \ --cc=selinux@vger.kernel.org \ --cc=stephen.smalley.work@gmail.com \ --cc=torvalds@linux-foundation.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.