All of lore.kernel.org
 help / color / mirror / Atom feed
* Fw:Re:Fw:Re:[RFC,v1,1/1] audit: speed up syscall rule match while exiting syscall
@ 2021-01-21 14:25 yang.yang29
  2021-01-22 17:31   ` Fw:Re:Fw:Re:[RFC, v1, 1/1] " Paul Moore
  0 siblings, 1 reply; 3+ messages in thread
From: yang.yang29 @ 2021-01-21 14:25 UTC (permalink / raw)
  To: paul; +Cc: linux-audit, linux-kernel


[-- Attachment #1.1: Type: text/plain, Size: 6069 bytes --]

Thanks for reply, I have sent a new patch with better performance.
The v1 patch uses mutex() is not necessary.

Performance measurements:
1.Environment
CPU: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
Linux kernel version: 5.11-rc3
Audit version: 2.8.4

2.Result
2.1 Syscall ratio
Run command "top" with no-load.
Add rule likes "auditctl -a always,exit -F arch=b64 -S chmod -F auid=[number]" which doesn't hit audit.
User command "perf record -Rg -t [top's pid] sleep 900" to get audit_filter_syscall()'s execute time ratio.

audit_filter_syscall() ratio with 100 rules:
before this patch: 15.29%.
after this patch: 0.13%, reduce 15.16%.

audit_filter_syscall() ratio with CIS rules:
before this patch: 2.25%.
after this patch: 1.21%%, reduce 1.04%.

audit_filter_syscall() ratio with 10 rules:
before this patch: 0.94%.
after this patch: 0.16%, reduce 0.78%.

audit_filter_syscall() ratio with 1 rule:
before this patch: 0.20%.
after this patch: 0.04%, reduce 0.16%.

Analyse:
Performance is better overall.

2.2 Syscall absolute time
Test method: 
Use ktime_get_real_ts64() in do_syscall_64() to calculate time.
Run command "chmod 777 /etc/fstab" with chown rules. Each test 10times and get average.

do_syscall_64() time with 100 rules:
before this patch: 7604ns
after this patch: 5244ns, reduce 2360ns.

do_syscall_64() time with CIS rules:
before this patch: 6710ns
after this patch: 7293ns, increase 583ns.

do_syscall_64() time with 10 rules:
before this patch: 5382ns
after this patch: 5171ns, reduce 211ns.

do_syscall_64() time with 1 rule:
before this patch: 5361ns
after this patch: 5375ns, increase 14ns.

do_syscall_64() time with no rules:
before this patch: 4735ns
after this patch: 4804ns, increase 69ns.

Analyse:
With a few rules, performance is close.
With 100 rules, performance is better, but with CIS rules performance regress. Maybe relevant to certain syscall.

2.3 Rule change absolute time 
Test method: 
Use command "time ..." to get rule change absolute time.

add 1 rule: time auditctl -a always,exit -F arch=b64 -S chmod -F auid=1
before this patch:
    real    0m 0.00s
    user    0m 0.00s
    sys     0m 0.00s
after this patch: 
    real    0m 0.00s
    user    0m 0.00s
    sys     0m 0.00s

delete 1 rule: time auditctl -d always,exit -F arch=b64 -S chmod -F auid=1
before this patch:
    real    0m 0.00s
    user    0m 0.00s
    sys     0m 0.00s
after this patch: 
    real    0m 0.00s
    user    0m 0.00s
    sys     0m 0.00s

add 100 rule: time augenrules --load
before this patch:
    real    0m 0.01s
    user    0m 0.00s
    sys     0m 0.00s
after this patch: 
    real    0m 0.02s
    user    0m 0.01s
    sys     0m 0.00s

delete 100 rule:time auditctl -D
before this patch:
    real    0m 0.00s
    user    0m 0.00s
    sys     0m 0.00s
after this patch: 
    real    0m 0.00s
    user    0m 0.00s
    sys     0m 0.00s

Analyse:
Performance regress is really trivial.


------------------Original Mail------------------
Sender: PaulMoore
To: yang yang10192021;
CC: Eric Paris;linux-audit@redhat.com;linux-kernel@vger.kernel.org;
Date: 2021/01/14 23:23
Subject: Re:Fw:Re:[RFC,v1,1/1] audit: speed up syscall rule match while exiting syscall
On Thu, Jan 14, 2021 at 8:25 AM <yang.yang29@zte.com.cn> wrote:
>
> Performance measurements:
> 1.Environment
> CPU: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
> Linux kernel version: 5.11-rc3
> Audit version: 2.8.4
>
> 2.Result
> 2.1 Syscall invocations
> Test method:
>     Run command "top" with no-load.
>     Add rule likes "auditctl -a always,exit -F arch=b64 -S chmod -F auid=[number]" which doesn't hit audit.
>     User command "perf record -Rg -t [top's pid] sleep 900" to get audit_filter_syscall()'s execute time ratio.

Thanks for providing some performance numbers so quickly, a few
comments and thoughts below ...

> audit_filter_syscall() ratio with 100 rules:
>     before this patch: 15.29%.
>     after this patch: 0.88%, reduce 14.41%.
> audit_filter_syscall() ratio with CIS[1] rules:
>     before this patch: 2.25%.
>     after this patch: 1.93%%, reduce 0.32%.
> audit_filter_syscall() ratio with 10 rules:
>     before this patch: 0.94%.
>     after this patch: 1.02%, increase 0.08%.
> audit_filter_syscall() ratio with 1 rule:
>     before this patch: 0.20%.
>     after this patch: 0.88%, increase 0.68%.

If we assume the CIS rules to be a reasonable common case (I'm not
sure if that is correct or not, but we'll skip that discussion for
now), we see an performance improvement of 0.32% correct, yes?  We
also see a performance regression with small number of syscall rules
that equalizes above ten rules, yes?

On your system can you provide some absolute numbers?  For example,
what does 0.32% equate to in terms of wall clock time for a given
syscall invocation?

> Analyse:
>     With 1 rule, after this patch performance is worse, because mutex_lock()/mutex_unlock(). But user just add one rule seems unusual.
>     With more rule, after this patch performance is improved.Typical likes CIS benchmark.
>
> 2.2 Rule change
> Test method:
>     Use ktime_get_real_ts64() before and after audit_add_rule()/audit_del_rule() to calculate time.
>      Add/delete rule by command "auditctl". Each test 10times and get average.

In this case I'm less concerned about micro benchmarks, and more
interested in the wall clock time difference when running auditctl to
add/remove rules.  The difference here in the micro benchmark is not
trivial, but with a delta of 4~5us it is possible that it is a
small(er) percentage when compared to the total time spent executing
auditctl.

> audit_add_rule() time:
>     before this patch: 3120ns.
>     after this patch: 7783ns, increase 149%.
> audit_del_rule() time:
>     before this patch: 3510ns.
>     after this patch: 8519ns, increase 143%.
>
> Analyse:
>     After this patch, rule change time obviously increase. But rule change may not happen very often.
>
> [1] CIS is a Linux Benchmarks for security purpose.
> https://www.cisecurity.org/benchmark/distribution_independent_linux/

--
paul moore
www.paul-moore.com

[-- Attachment #2: Type: text/plain, Size: 102 bytes --]

--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Fw:Re:Fw:Re:[RFC,v1,1/1] audit: speed up syscall rule match while exiting syscall
  2021-01-21 14:25 Fw:Re:Fw:Re:[RFC,v1,1/1] audit: speed up syscall rule match while exiting syscall yang.yang29
@ 2021-01-22 17:31   ` Paul Moore
  0 siblings, 0 replies; 3+ messages in thread
From: Paul Moore @ 2021-01-22 17:31 UTC (permalink / raw)
  To: yang.yang29; +Cc: linux-audit, linux-kernel

On Thu, Jan 21, 2021 at 9:25 AM <yang.yang29@zte.com.cn> wrote:
>
> Thanks for reply, I have sent a new patch with better performance.
> The v1 patch uses mutex() is not necessary.
>
> Performance measurements:
> 1.Environment
> CPU: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
> Linux kernel version: 5.11-rc3
> Audit version: 2.8.4

...

> 2.2 Syscall absolute time
> Test method:
> Use ktime_get_real_ts64() in do_syscall_64() to calculate time.
> Run command "chmod 777 /etc/fstab" with chown rules. Each test 10times and get average.
>
> do_syscall_64() time with 100 rules:
> before this patch: 7604ns
> after this patch: 5244ns, reduce 2360ns.
>
> do_syscall_64() time with CIS rules:
> before this patch: 6710ns
> after this patch: 7293ns, increase 583ns.
>
> do_syscall_64() time with 10 rules:
> before this patch: 5382ns
> after this patch: 5171ns, reduce 211ns.
>
> do_syscall_64() time with 1 rule:
> before this patch: 5361ns
> after this patch: 5375ns, increase 14ns.
>
> do_syscall_64() time with no rules:
> before this patch: 4735ns
> after this patch: 4804ns, increase 69ns.
>
> Analyse:
> With a few rules, performance is close.
> With 100 rules, performance is better, but with CIS rules performance regress. Maybe relevant to certain syscall.

These numbers aren't particularly good in my opinion, the negative
impact of the change to a small number of rules and to the CIS ruleset
is not a good thing.  It also should be said that you are increasing
the memory footprint, even if it is relatively small.

However, if we take a step back and look at the motivation for this
work I wonder if there are some things we can do to improve the
per-syscall rule processing performance.  On thing that jumped out
just now was this code in __audit_syscall_exit():

void __audit_syscall_exit(int success, long return_code)
{

  /* ... */

  /*
   * we need to fix up the return code in the audit logs if the
   * actual return codes are later going to be fixed up by the
   * arch specific signal handlers ... */
  if (unlikely(return_code <= -ERESTARTSYS) &&
      (return_code >= -ERESTART_RESTARTBLOCK) &&
      (return_code != -ENOIOCTLCMD))
    context->return_code = -EINTR;
  else
    context->return_code  = return_code;

  audit_filter_syscall(current, context,
    &audit_filter_list[AUDIT_FILTER_EXIT]);
  audit_filter_inodes(current, context);
  if (context->current_state == AUDIT_RECORD_CONTEXT)
    audit_log_exit();
  }

... in the snippet above the audit_filter_inodes() function/rules are
given priority over the syscall rules in
audit_filter_syscall(AUDIT_FILTER_EXIT), so why not first execute
audit_filter_inodes() and only execute
audit_filter_syscall(AUDIT_FILTER_EXIT) if necessary?  It may be that
I'm missing something on this quick look at the code, but I think it
is worth investigating.  It's also possible there are other similar
improvements to made.

There is similar code in __audit_free() but that should be less
performance critical.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Fw:Re:Fw:Re:[RFC, v1, 1/1] audit: speed up syscall rule match while exiting syscall
@ 2021-01-22 17:31   ` Paul Moore
  0 siblings, 0 replies; 3+ messages in thread
From: Paul Moore @ 2021-01-22 17:31 UTC (permalink / raw)
  To: yang.yang29; +Cc: linux-audit, linux-kernel

On Thu, Jan 21, 2021 at 9:25 AM <yang.yang29@zte.com.cn> wrote:
>
> Thanks for reply, I have sent a new patch with better performance.
> The v1 patch uses mutex() is not necessary.
>
> Performance measurements:
> 1.Environment
> CPU: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
> Linux kernel version: 5.11-rc3
> Audit version: 2.8.4

...

> 2.2 Syscall absolute time
> Test method:
> Use ktime_get_real_ts64() in do_syscall_64() to calculate time.
> Run command "chmod 777 /etc/fstab" with chown rules. Each test 10times and get average.
>
> do_syscall_64() time with 100 rules:
> before this patch: 7604ns
> after this patch: 5244ns, reduce 2360ns.
>
> do_syscall_64() time with CIS rules:
> before this patch: 6710ns
> after this patch: 7293ns, increase 583ns.
>
> do_syscall_64() time with 10 rules:
> before this patch: 5382ns
> after this patch: 5171ns, reduce 211ns.
>
> do_syscall_64() time with 1 rule:
> before this patch: 5361ns
> after this patch: 5375ns, increase 14ns.
>
> do_syscall_64() time with no rules:
> before this patch: 4735ns
> after this patch: 4804ns, increase 69ns.
>
> Analyse:
> With a few rules, performance is close.
> With 100 rules, performance is better, but with CIS rules performance regress. Maybe relevant to certain syscall.

These numbers aren't particularly good in my opinion, the negative
impact of the change to a small number of rules and to the CIS ruleset
is not a good thing.  It also should be said that you are increasing
the memory footprint, even if it is relatively small.

However, if we take a step back and look at the motivation for this
work I wonder if there are some things we can do to improve the
per-syscall rule processing performance.  On thing that jumped out
just now was this code in __audit_syscall_exit():

void __audit_syscall_exit(int success, long return_code)
{

  /* ... */

  /*
   * we need to fix up the return code in the audit logs if the
   * actual return codes are later going to be fixed up by the
   * arch specific signal handlers ... */
  if (unlikely(return_code <= -ERESTARTSYS) &&
      (return_code >= -ERESTART_RESTARTBLOCK) &&
      (return_code != -ENOIOCTLCMD))
    context->return_code = -EINTR;
  else
    context->return_code  = return_code;

  audit_filter_syscall(current, context,
    &audit_filter_list[AUDIT_FILTER_EXIT]);
  audit_filter_inodes(current, context);
  if (context->current_state == AUDIT_RECORD_CONTEXT)
    audit_log_exit();
  }

... in the snippet above the audit_filter_inodes() function/rules are
given priority over the syscall rules in
audit_filter_syscall(AUDIT_FILTER_EXIT), so why not first execute
audit_filter_inodes() and only execute
audit_filter_syscall(AUDIT_FILTER_EXIT) if necessary?  It may be that
I'm missing something on this quick look at the code, but I think it
is worth investigating.  It's also possible there are other similar
improvements to made.

There is similar code in __audit_free() but that should be less
performance critical.

-- 
paul moore
www.paul-moore.com

--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-01-22 17:45 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-21 14:25 Fw:Re:Fw:Re:[RFC,v1,1/1] audit: speed up syscall rule match while exiting syscall yang.yang29
2021-01-22 17:31 ` Paul Moore
2021-01-22 17:31   ` Fw:Re:Fw:Re:[RFC, v1, 1/1] " Paul Moore

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.