From: Stefan Metzmacher <metze@samba.org>
To: Paul Moore <paul@paul-moore.com>,
	Pavel Begunkov <asml.silence@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-security-module@vger.kernel.org, selinux@vger.kernel.org,
	linux-audit@redhat.com, io-uring@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,
	Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>
Subject: Re: [RFC PATCH 2/9] audit,io_uring,io-wq: add some basic audit support to io_uring
Date: Wed, 26 May 2021 17:17:46 +0200
Message-ID: <18823c99-7d65-0e6f-d508-a487f1b4b9e7@samba.org>
In-Reply-To: <CAHC9VhTAvcB0A2dpv1Xn7sa+Kh1n+e-dJr_8wSSRaxS4D0f9Sw@mail.gmail.com>


On 26.05.21 at 16:38, Paul Moore wrote:
> On Wed, May 26, 2021 at 6:19 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>> On 5/26/21 3:04 AM, Paul Moore wrote:
>>> On Tue, May 25, 2021 at 9:11 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>> On 5/24/21 1:59 PM, Paul Moore wrote:
>>>>> That said, audit is not for everyone, and we have build time and
>>>>> runtime options to help make life easier.  Beyond simply disabling
>>>>> audit at compile time a number of Linux distributions effectively
>>>>> shortcut audit at runtime by adding a "never" rule to the audit
>>>>> filter, for example:
>>>>>
>>>>>  % auditctl -a task,never
>>>>
>>>> As has been brought up, the issue we're facing is that distros have
>>>> CONFIG_AUDIT=y and hence the above is the best real world case outside
>>>> of people doing custom kernels. My question would then be how much
>>>> overhead the above will add, considering it's an entry/exit call per op.
>>>> If auditctl is turned off, what is the expectation in terms of overhead?
>>>
>>> I commented on that case in my last email to Pavel, but I'll try to go
>>> over it again in a little more detail.
>>>
>>> As we discussed earlier in this thread, we can skip the req->opcode
>>> check before both the _entry and _exit calls, so we are left with just
>>> the bare audit calls in the io_uring code.  As the _entry and _exit
>>> functions are small, I've copied them and their supporting functions
>>> below and I'll try to explain what would happen in the CONFIG_AUDIT=y,
>>> "task,never" case.
>>>
>>> +  static inline struct audit_context *audit_context(void)
>>> +  {
>>> +    return current->audit_context;
>>> +  }
>>>
>>> +  static inline bool audit_dummy_context(void)
>>> +  {
>>> +    void *p = audit_context();
>>> +    return !p || *(int *)p;
>>> +  }
>>>
>>> +  static inline void audit_uring_entry(u8 op)
>>> +  {
>>> +    if (unlikely(audit_enabled && audit_context()))
>>> +      __audit_uring_entry(op);
>>> +  }
>>
>> I'd rather agree that it's my cycle-picking. The case I care about
>> is CONFIG_AUDIT=y (because everybody enables it), and io_uring
>> tracing _not_ enabled at runtime. If it is enabled, let them suffer
>> the overhead; it will probably reduce performance.
>>
>> So, for the case I care about it's two of
>>
>> if (unlikely(audit_enabled && current->audit_context))
>>
>> in the hot path. load-test-jump + current, so it will
>> be around 7x2 instructions. We can throw away audit_enabled
>> as you say systemd already enables it, that will give
>> 4x2 instructions including 2 conditional jumps.
> 
> We've basically got it down to the equivalent of two
> "current->audit_context != NULL" checks in the case where audit is
> built into the kernel but disabled at runtime, e.g. CONFIG_AUDIT=y and
> "task,never".  I'm at a loss for how we can lower the overhead any
> further, but I'm open to suggestions.
> 
>> That's not great at all. And that's why I brought up
>> the question about the need for pre and post hooks and whether
>> they can be combined. That would be just 4 instructions, and that is
>> ok (ish).
> 
> As discussed previously in this thread that isn't really an option
> from an audit perspective.
> 
>>> We would need to check with the current security requirements (there
>>> are distro people on the linux-audit list that keep track of that
>>> stuff), but looking at the opcodes right now my gut feeling is that
>>> most of the opcodes would be considered "security relevant" so
>>> selective auditing might not be that useful in practice.  It would
>>> definitely clutter the code and increase the chances that new opcodes
>>> would not be properly audited when they are merged.
>>
>> I'm curious why it's enabled by many distros by default. Are there
>> use cases they rely on?
> 
> We've already talked about certain users and environments where audit
> is an important requirement, e.g. public sector, health care,
> financial institutions, etc.; without audit Linux wouldn't be an
> option for these users, at least not without heavy modification,
> out-of-tree/ISV patches, etc.  I currently don't have any direct ties
> to any distros, "Enterprise" or otherwise, but in the past it has been
> my experience that distros much prefer to have a single kernel build
> to address the needs of all their users.  In the few cases I have seen
> where a second kernel build is supported it is usually for hardware
> enablement.  I'm sure there are other cases too, I just haven't seen
> them personally; the big distros definitely seem to have a strong
> desire to limit the number of supported kernel configs/builds.
> 
>> Tempting to add AUDIT_IOURING=default N, but won't work I guess
> 
> One of the nice things about audit is that it can give you a history
> of what a user did on a system, which is very important for a number
> of use cases.  If we selectively disable audit for certain subsystems
> we create a blind spot in the audit log, and in the case of io_uring
> this can be a very serious blind spot.  I fear that if we can't come
> to some agreement here we will need to make io_uring and audit
> mutually exclusive at build time which would be awful; forcing many
> distros to either make a hard choice or carry out-of-tree patches.

I'm wondering why it isn't enough to just let the native auditing happen.

E.g. all socket-related io_uring opcodes (I have checked RECVMSG, SENDMSG, SEND and CONNECT)
already go via security_socket_{recvmsg,sendmsg,connect}().

IORING_OP_OPENAT* goes via do_filp_open() which is in common with the open[at[2]]() syscalls
and should also trigger audit_inode() and security_file_open().
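
To make the point above concrete, here is a minimal userspace sketch (not
kernel code) of why a hook that lives in a shared helper fires no matter which
front end reaches it. Every name in it (model_security_file_open,
model_do_filp_open, model_sys_openat, model_uring_openat, hook_calls) is a
hypothetical stand-in chosen for illustration; only do_filp_open() and
security_file_open() are names taken from the text, and the real functions have
different signatures.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical counter standing in for "the LSM/audit hook ran". */
static int hook_calls;

/* Stand-in for security_file_open(): counts every call, denies one path. */
static bool model_security_file_open(const char *path)
{
    hook_calls++;
    return strcmp(path, "/forbidden") != 0;
}

/* Stand-in for do_filp_open(): the shared helper that contains the hook. */
static int model_do_filp_open(const char *path)
{
    if (!model_security_file_open(path))
        return -1;  /* a real kernel would return a negative errno */
    return 0;
}

/* Syscall entry path (assumed shape, for illustration only). */
static int model_sys_openat(const char *path)
{
    return model_do_filp_open(path);
}

/* io_uring opcode entry path: different front end, same helper. */
static int model_uring_openat(const char *path)
{
    return model_do_filp_open(path);
}
```

In this model both entry points get the same decision for the same path, and
hook_calls increases once per open regardless of which one was used. Whether
that holds for every real opcode is exactly the question being asked here.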

So why is anything special needed for io_uring, now that native worker threads are used?

Is there really any io_uring opcode that bypasses the security checks the corresponding native syscall
would do? If so, I think that should just be fixed...

Additional LSM-based restrictions could be hooked into the io_check_restriction() path
and set up at io_uring_setup() or early io_uring_register() time.
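
As a rough illustration of that idea, the following userspace sketch models a
per-ring allowlist that is filled in once at setup time and consulted on every
submission, with an optional policy callback layered on top. It is only a model
under my own assumptions: struct model_restriction, model_allow_op(),
model_check_restriction(), model_policy_deny_op3() and MODEL_OP_LAST are all
hypothetical names, and this is not the actual io_check_restriction() code.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_OP_LAST 64  /* hypothetical opcode count; fits one 64-bit word */

/* Hypothetical per-ring restriction state, filled in at setup time. */
struct model_restriction {
    uint64_t sqe_op_allowed;            /* one bit per allowed opcode */
    bool (*lsm_allows)(uint8_t opcode); /* optional extra policy hook */
    bool registered;                    /* false: no restrictions apply */
};

/* Setup-time step: mark one opcode as allowed. */
static void model_allow_op(struct model_restriction *r, uint8_t opcode)
{
    if (opcode < MODEL_OP_LAST)
        r->sqe_op_allowed |= UINT64_C(1) << opcode;
}

/* Example policy callback (hypothetical): refuse opcode 3 only. */
static bool model_policy_deny_op3(uint8_t opcode)
{
    return opcode != 3;
}

/* Submission-time check: allowlist first, then the optional policy hook. */
static bool model_check_restriction(const struct model_restriction *r,
                                    uint8_t opcode)
{
    if (!r->registered)
        return true;
    if (opcode >= MODEL_OP_LAST)
        return false;
    if (!(r->sqe_op_allowed & (UINT64_C(1) << opcode)))
        return false;
    return r->lsm_allows ? r->lsm_allows(opcode) : true;
}
```

The point of the design is that the per-submission cost is a flag test and a
bit test when no policy callback is installed, and the policy is only consulted
for rings that opted into restrictions.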

What do you think?

metze

