* [PATCH] audit: accelerate audit rule filter
@ 2021-11-23 7:50 Zixuan Zhao
2021-11-24 15:42 ` Paul Moore
0 siblings, 1 reply; 7+ messages in thread
From: Zixuan Zhao @ 2021-11-23 7:50 UTC (permalink / raw)
To: paul, eparis; +Cc: linux-audit, linux-kernel
From: zhaozixuan <zhaozixuan2@huawei.com>
When a syscall exits, audit traverses the filter rules until it finds a
match. For syscalls not monitored by any rule, audit must traverse the
whole list just to learn that nothing matches. This work is repeated on
every syscall exit and causes noticeable overhead once a user has added
many syscall rules. To solve this problem, add an array recording, for
each syscall, the number of rules interested in it. At syscall exit, audit
checks the array to decide whether to search the rules or bail out early,
reducing the cost from O(n) to O(1) for syscalls that no rule monitors.
This patch does so with the following changes:
1. Define a global array, audit_syscall_rules, recording the number of
rules interested in each syscall. For architectures that support a second
(compat) syscall table, define an additional array,
audit_compat_syscall_rules.
2. When a user adds or deletes a rule, use each syscall number the rule
covers as an index into the global array and increment/decrement the
corresponding counter. Tree-type rules usually monitor many syscalls,
which would dilute the optimization, so move them from
audit_filter_list[AUDIT_FILTER_EXIT] to a new rule list,
audit_filter_dir_list, and add a new function, audit_filter_dir(), to
handle these rules.
3. Add a check to audit_filter_syscall(): if
audit_syscall_rules[major] == 0 (or audit_compat_syscall_rules[major] == 0
for compat syscalls), leave the function early.
We used lat_syscall from lmbench3 to measure the performance impact of
this patch. We varied the number of rules and ran lat_syscall with 1000
repetitions per test. The syscalls measured by lat_syscall are not
monitored by any rule.
Before this optimization:
null read write stat fstat open
0 rules 1.87ms 2.74ms 2.56ms 26.31ms 4.13ms 69.66ms
10 rules 2.15ms 3.13ms 3.32ms 26.99ms 4.16ms 74.70ms
20 rules 2.45ms 3.97ms 3.82ms 27.05ms 4.60ms 76.35ms
30 rules 2.64ms 4.52ms 3.95ms 30.30ms 4.94ms 78.94ms
40 rules 2.83ms 4.97ms 4.23ms 32.16ms 5.40ms 81.88ms
50 rules 3.00ms 5.30ms 4.84ms 33.49ms 5.79ms 83.20ms
100 rules 4.24ms 9.75ms 7.42ms 37.68ms 6.55ms 93.70ms
160 rules 5.50ms 16.89ms 12.18ms 51.53ms 17.45ms 155.40ms
After this optimization:
null read write stat fstat open
0 rules 1.81ms 2.84ms 2.42ms 27.70ms 4.15ms 69.10ms
10 rules 1.97ms 2.83ms 2.69ms 27.70ms 4.15ms 69.30ms
20 rules 1.72ms 2.91ms 2.41ms 26.49ms 3.91ms 71.19ms
30 rules 1.85ms 2.94ms 2.48ms 26.27ms 3.97ms 71.43ms
40 rules 1.88ms 2.94ms 2.78ms 26.85ms 4.08ms 69.79ms
50 rules 1.86ms 3.17ms 3.08ms 26.25ms 4.03ms 72.32ms
100 rules 1.84ms 3.00ms 2.81ms 26.25ms 3.98ms 70.25ms
160 rules 1.92ms 3.32ms 3.06ms 26.81ms 4.57ms 71.41ms
As the results above show, syscall latencies increase as the number of
rules grows, while with the patch the latencies remain stable. This helps
when a user needs many audit rules (e.g. for attack tracing or recording
process behavior) but cannot afford the slowdown.
Signed-off-by: zhaozixuan <zhaozixuan2@huawei.com>
---
include/linux/audit.h | 13 +++++
kernel/audit.c | 15 ++++++
kernel/audit.h | 12 +++++
kernel/auditfilter.c | 122 +++++++++++++++++++++++++++++++++++++++++-
kernel/auditsc.c | 48 ++++++++++++++++-
5 files changed, 207 insertions(+), 3 deletions(-)
diff --git a/include/linux/audit.h b/include/linux/audit.h
index 82b7c1116a85..988b673ac66d 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -280,6 +280,19 @@ static inline int audit_signal_info(int sig, struct task_struct *t)
#define AUDIT_INODE_HIDDEN 2 /* audit record should be hidden */
#define AUDIT_INODE_NOEVAL 4 /* audit record incomplete */
+#include <asm/seccomp.h>
+#define AUDIT_ARCH_NATIVE SECCOMP_ARCH_NATIVE
+#define AUDIT_ARCH_NATIVE_NR SECCOMP_ARCH_NATIVE_NR
+#ifdef SECCOMP_ARCH_COMPAT
+#define AUDIT_ARCH_COMPAT SECCOMP_ARCH_COMPAT
+#define AUDIT_ARCH_COMPAT_NR SECCOMP_ARCH_COMPAT_NR
+#define AUDIT_ARCH_MAX_NR (SECCOMP_ARCH_NATIVE_NR > \
+ SECCOMP_ARCH_COMPAT_NR ? SECCOMP_ARCH_NATIVE_NR : \
+ SECCOMP_ARCH_COMPAT_NR)
+#else
+#define AUDIT_ARCH_MAX_NR SECCOMP_ARCH_NATIVE_NR
+#endif
+
#ifdef CONFIG_AUDITSYSCALL
#include <asm/syscall.h> /* for syscall_get_arch() */
diff --git a/kernel/audit.c b/kernel/audit.c
index 121d37e700a6..813a69bbf81a 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1638,6 +1638,10 @@ static struct pernet_operations audit_net_ops __net_initdata = {
static int __init audit_init(void)
{
int i;
+ struct audit_rule_count *arc = NULL;
+#ifdef AUDIT_ARCH_COMPAT
+ struct audit_rule_count *compat_arc = NULL;
+#endif
if (audit_initialized == AUDIT_DISABLED)
return 0;
@@ -1668,6 +1672,17 @@ static int __init audit_init(void)
panic("audit: failed to start the kauditd thread (%d)\n", err);
}
+ arc = alloc_audit_rule_count(AUDIT_ARCH_NATIVE_NR);
+ if (!arc)
+ panic("audit: failed to initialize audit_syscall_rules\n");
+ rcu_assign_pointer(audit_syscall_rules, arc);
+#ifdef AUDIT_ARCH_COMPAT
+ compat_arc = alloc_audit_rule_count(AUDIT_ARCH_COMPAT_NR);
+ if (!compat_arc)
+ panic("audit: failed to initialize audit_compat_syscall_rules\n");
+ rcu_assign_pointer(audit_compat_syscall_rules, compat_arc);
+#endif
+
audit_log(NULL, GFP_KERNEL, AUDIT_KERNEL,
"state=initialized audit_enabled=%u res=1",
audit_enabled);
diff --git a/kernel/audit.h b/kernel/audit.h
index d6a2c899a8db..7e452b9e2a30 100644
--- a/kernel/audit.h
+++ b/kernel/audit.h
@@ -199,6 +199,11 @@ struct audit_context {
struct audit_proctitle proctitle;
};
+struct audit_rule_count {
+ struct rcu_head rcu;
+ int length;
+ unsigned int counts[];
+};
extern bool audit_ever_enabled;
extern void audit_log_session_info(struct audit_buffer *ab);
@@ -216,6 +221,13 @@ static inline int audit_hash_ino(u32 ino)
/* Indicates that audit should log the full pathname. */
#define AUDIT_NAME_FULL -1
+extern struct audit_rule_count *audit_syscall_rules;
+#ifdef AUDIT_ARCH_COMPAT
+extern struct audit_rule_count *audit_compat_syscall_rules;
+#endif
+extern struct audit_rule_count *alloc_audit_rule_count(int length);
+extern struct list_head audit_filter_dir_list;
+extern int audit_in_mask(const struct audit_krule *rule, unsigned long val);
extern int audit_match_class(int class, unsigned syscall);
extern int audit_comparator(const u32 left, const u32 op, const u32 right);
extern int audit_uid_comparator(kuid_t left, u32 op, kuid_t right);
diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
index db2c6b59dfc3..53da9cf99d29 100644
--- a/kernel/auditfilter.c
+++ b/kernel/auditfilter.c
@@ -60,6 +60,12 @@ static struct list_head audit_rules_list[AUDIT_NR_FILTERS] = {
DEFINE_MUTEX(audit_filter_mutex);
+struct audit_rule_count *audit_syscall_rules;
+#ifdef AUDIT_ARCH_COMPAT
+struct audit_rule_count *audit_compat_syscall_rules;
+#endif
+LIST_HEAD(audit_filter_dir_list);
+
static void audit_free_lsm_field(struct audit_field *f)
{
switch (f->type) {
@@ -909,6 +915,8 @@ static struct audit_entry *audit_find_rule(struct audit_entry *entry,
}
}
goto out;
+ } else if (entry->rule.tree) {
+ *p = list = &audit_filter_dir_list;
} else {
*p = list = &audit_filter_list[entry->rule.listnr];
}
@@ -926,6 +934,105 @@ static struct audit_entry *audit_find_rule(struct audit_entry *entry,
static u64 prio_low = ~0ULL/2;
static u64 prio_high = ~0ULL/2 - 1;
+struct audit_rule_count *alloc_audit_rule_count(int length)
+{
+ struct audit_rule_count *arc = kzalloc(
+ sizeof(struct audit_rule_count) + sizeof(unsigned int) * length,
+ GFP_KERNEL);
+ if (arc)
+ arc->length = length;
+ return arc;
+}
+static int copy_rule_counts(int arch, struct audit_rule_count **old_counts,
+ struct audit_rule_count **new_counts)
+{
+ if (arch == AUDIT_ARCH_NATIVE)
+ *old_counts = rcu_dereference_protected(audit_syscall_rules,
+ lockdep_is_held(&audit_filter_mutex));
+#ifdef AUDIT_ARCH_COMPAT
+ else if (arch == AUDIT_ARCH_COMPAT)
+ *old_counts = rcu_dereference_protected(audit_compat_syscall_rules,
+ lockdep_is_held(&audit_filter_mutex));
+#endif
+ else
+ return -EINVAL;
+
+ *new_counts = alloc_audit_rule_count((*old_counts)->length);
+ if (!*new_counts)
+ return -ENOMEM;
+
+ memcpy((*new_counts)->counts,
+ (*old_counts)->counts,
+ sizeof(unsigned int) * (*old_counts)->length);
+
+ return 0;
+}
+
+static inline bool arch_monitored(struct audit_entry *entry, int arch)
+{
+ return (entry->rule.arch_f == NULL ||
+ audit_comparator(arch,
+ entry->rule.arch_f->op,
+ entry->rule.arch_f->val));
+}
+
+static int audit_update_syscall_rule(struct audit_entry *entry, int delt)
+{
+ int i = 0;
+ int err = 0;
+ struct audit_rule_count *new_counts = NULL;
+ struct audit_rule_count *old_counts = NULL;
+ bool update_native;
+#ifdef AUDIT_ARCH_COMPAT
+ struct audit_rule_count *new_compat_counts = NULL;
+ struct audit_rule_count *old_compat_counts = NULL;
+ bool update_compat;
+#endif
+ if (entry->rule.listnr != AUDIT_FILTER_EXIT || entry->rule.watch || entry->rule.tree)
+ return 0;
+
+ update_native = arch_monitored(entry, AUDIT_ARCH_NATIVE);
+ if (update_native) {
+ err = copy_rule_counts(AUDIT_ARCH_NATIVE, &old_counts, &new_counts);
+ if (err)
+ return err;
+ }
+
+#ifdef AUDIT_ARCH_COMPAT
+ update_compat = arch_monitored(entry, AUDIT_ARCH_COMPAT);
+ if (update_compat) {
+ err = copy_rule_counts(AUDIT_ARCH_COMPAT, &old_compat_counts, &new_compat_counts);
+ if (err) {
+ kfree(new_counts);
+ return err;
+ }
+ }
+#endif
+
+ for (i = 0; i < AUDIT_ARCH_MAX_NR; i++) {
+ if ((audit_in_mask(&entry->rule, i) == 0))
+ continue;
+ if (i < AUDIT_ARCH_NATIVE_NR && update_native)
+ new_counts->counts[i] += delt;
+#ifdef AUDIT_ARCH_COMPAT
+ if (i < AUDIT_ARCH_COMPAT_NR && update_compat)
+ new_compat_counts->counts[i] += delt;
+#endif
+ }
+
+ if (update_native) {
+ rcu_assign_pointer(audit_syscall_rules, new_counts);
+ kfree_rcu(old_counts, rcu);
+ }
+#ifdef AUDIT_ARCH_COMPAT
+ if (update_compat) {
+ rcu_assign_pointer(audit_compat_syscall_rules, new_compat_counts);
+ kfree_rcu(old_compat_counts, rcu);
+ }
+#endif
+ return 0;
+}
+
/* Add rule to given filterlist if not a duplicate. */
static inline int audit_add_rule(struct audit_entry *entry)
{
@@ -957,6 +1064,15 @@ static inline int audit_add_rule(struct audit_entry *entry)
return err;
}
+ err = audit_update_syscall_rule(entry, 1);
+ if (err) {
+ mutex_unlock(&audit_filter_mutex);
+ /* normally audit_add_tree_rule() will free it on failure */
+ if (tree)
+ audit_put_tree(tree);
+ return err;
+ }
+
if (watch) {
/* audit_filter_mutex is dropped and re-taken during this call */
err = audit_add_watch(&entry->rule, &list);
@@ -994,7 +1110,7 @@ static inline int audit_add_rule(struct audit_entry *entry)
entry->rule.flags &= ~AUDIT_FILTER_PREPEND;
} else {
list_add_tail(&entry->rule.list,
- &audit_rules_list[entry->rule.listnr]);
+ &audit_rules_list[entry->rule.listnr]);
list_add_tail_rcu(&entry->list, list);
}
#ifdef CONFIG_AUDITSYSCALL
@@ -1035,6 +1151,10 @@ int audit_del_rule(struct audit_entry *entry)
goto out;
}
+ ret = audit_update_syscall_rule(e, -1);
+ if (ret)
+ goto out;
+
if (e->rule.watch)
audit_remove_watch_rule(&e->rule);
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index b1cb1dbf7417..da764328c3aa 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -789,7 +789,7 @@ static enum audit_state audit_filter_task(struct task_struct *tsk, char **key)
return AUDIT_STATE_BUILD;
}
-static int audit_in_mask(const struct audit_krule *rule, unsigned long val)
+int audit_in_mask(const struct audit_krule *rule, unsigned long val)
{
int word, bit;
@@ -805,6 +805,25 @@ static int audit_in_mask(const struct audit_krule *rule, unsigned long val)
return rule->mask[word] & bit;
}
+static bool audit_syscall_monitored(int arch, int major)
+{
+ struct audit_rule_count *arc = NULL;
+
+ if (arch == AUDIT_ARCH_NATIVE)
+ arc = rcu_dereference(audit_syscall_rules);
+#ifdef AUDIT_ARCH_COMPAT
+ else if (arch == AUDIT_ARCH_COMPAT)
+ arc = rcu_dereference(audit_compat_syscall_rules);
+#endif
+ else
+ return false;
+
+ if (major < arc->length)
+ return arc->counts[major] != 0;
+
+ return false;
+}
+
/* At syscall exit time, this filter is called if the audit_state is
* not low enough that auditing cannot take place, but is also not
* high enough that we already know we have to write an audit record
@@ -820,6 +839,11 @@ static void audit_filter_syscall(struct task_struct *tsk,
return;
rcu_read_lock();
+ if (likely(!audit_syscall_monitored(ctx->arch, ctx->major))) {
+ rcu_read_unlock();
+ return;
+ }
+
list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_EXIT], list) {
if (audit_in_mask(&e->rule, ctx->major) &&
audit_filter_rules(tsk, &e->rule, ctx, NULL,
@@ -833,6 +857,25 @@ static void audit_filter_syscall(struct task_struct *tsk,
return;
}
+static void audit_filter_dir(struct task_struct *tsk,
+ struct audit_context *ctx)
+{
+ struct audit_entry *e;
+ enum audit_state state;
+
+ rcu_read_lock();
+ list_for_each_entry_rcu(e, &audit_filter_dir_list, list) {
+ if (audit_in_mask(&e->rule, ctx->major) &&
+ audit_filter_rules(tsk, &e->rule, ctx, NULL,
+ &state, false)) {
+ rcu_read_unlock();
+ ctx->current_state = state;
+ return;
+ }
+ }
+ rcu_read_unlock();
+}
+
/*
* Given an audit_name check the inode hash table to see if they match.
* Called holding the rcu read lock to protect the use of audit_inode_hash
@@ -1638,6 +1681,7 @@ void __audit_free(struct task_struct *tsk)
context->return_code = 0;
audit_filter_syscall(tsk, context);
+ audit_filter_dir(tsk, context);
audit_filter_inodes(tsk, context);
if (context->current_state == AUDIT_STATE_RECORD)
audit_log_exit();
@@ -1719,7 +1763,6 @@ void __audit_syscall_exit(int success, long return_code)
if (!list_empty(&context->killed_trees))
audit_kill_trees(context);
-
if (!context->dummy && context->in_syscall) {
if (success)
context->return_valid = AUDITSC_SUCCESS;
@@ -1745,6 +1788,7 @@ void __audit_syscall_exit(int success, long return_code)
context->return_code = return_code;
audit_filter_syscall(current, context);
+ audit_filter_dir(current, context);
audit_filter_inodes(current, context);
if (context->current_state == AUDIT_STATE_RECORD)
audit_log_exit();
--
2.17.1
--
Linux-audit mailing list
Linux-audit@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-audit
* Re: [PATCH] audit: accelerate audit rule filter
2021-11-23 7:50 [PATCH] audit: accelerate audit rule filter Zixuan Zhao
@ 2021-11-24 15:42 ` Paul Moore
2021-11-29 7:35 ` zhaozixuan (C)
0 siblings, 1 reply; 7+ messages in thread
From: Paul Moore @ 2021-11-24 15:42 UTC (permalink / raw)
To: Zixuan Zhao; +Cc: linux-audit, linux-kernel, eparis
On Tue, Nov 23, 2021 at 2:50 AM Zixuan Zhao <zhaozixuan2@huawei.com> wrote:
> We used lat_syscall of lmbench3 to test the performance impact of this
> patch. We changed the number of rules and run lat_syscall with 1000
> repetitions at each test. Syscalls measured by lat_syscall are not
> monitored by rules.
>
> Before this optimization:
>
> null read write stat fstat open
> 0 rules 1.87ms 2.74ms 2.56ms 26.31ms 4.13ms 69.66ms
> 10 rules 2.15ms 3.13ms 3.32ms 26.99ms 4.16ms 74.70ms
> 20 rules 2.45ms 3.97ms 3.82ms 27.05ms 4.60ms 76.35ms
> 30 rules 2.64ms 4.52ms 3.95ms 30.30ms 4.94ms 78.94ms
> 40 rules 2.83ms 4.97ms 4.23ms 32.16ms 5.40ms 81.88ms
> 50 rules 3.00ms 5.30ms 4.84ms 33.49ms 5.79ms 83.20ms
> 100 rules 4.24ms 9.75ms 7.42ms 37.68ms 6.55ms 93.70ms
> 160 rules 5.50ms 16.89ms 12.18ms 51.53ms 17.45ms 155.40ms
>
> After this optimization:
>
> null read write stat fstat open
> 0 rules 1.81ms 2.84ms 2.42ms 27.70ms 4.15ms 69.10ms
> 10 rules 1.97ms 2.83ms 2.69ms 27.70ms 4.15ms 69.30ms
> 20 rules 1.72ms 2.91ms 2.41ms 26.49ms 3.91ms 71.19ms
> 30 rules 1.85ms 2.94ms 2.48ms 26.27ms 3.97ms 71.43ms
> 40 rules 1.88ms 2.94ms 2.78ms 26.85ms 4.08ms 69.79ms
> 50 rules 1.86ms 3.17ms 3.08ms 26.25ms 4.03ms 72.32ms
> 100 rules 1.84ms 3.00ms 2.81ms 26.25ms 3.98ms 70.25ms
> 160 rules 1.92ms 3.32ms 3.06ms 26.81ms 4.57ms 71.41ms
>
> As the result shown above, the syscall latencies increase as the number
> of rules increases, while with the patch the latencies remain stable.
> This could help when a user adds many audit rules for purposes (such as
> attack tracing or process behavior recording) but suffers from low
> performance.
I have general concerns about trading memory and complexity for
performance gains, but beyond that the numbers you posted above don't
yet make sense to me.
Why are the latency increases due to rule count not similar across the
different syscalls? For example, I would think that if the increase in
syscall latency was directly attributed to the audit rule processing
then the increase on the "open" syscall should be similar to that of
the "null" syscall. In other phrasing, if we can process 160 rules in
~4ms in the "null" case, why does it take us ~86ms in the "open" case?
--
paul moore
www.paul-moore.com
* Re: [PATCH] audit: accelerate audit rule filter
2021-11-24 15:42 ` Paul Moore
@ 2021-11-29 7:35 ` zhaozixuan (C)
2021-12-01 19:39 ` Paul Moore
0 siblings, 1 reply; 7+ messages in thread
From: zhaozixuan (C) @ 2021-11-29 7:35 UTC (permalink / raw)
To: Paul Moore; +Cc: linux-audit, linux-kernel, eparis
>On Tue, Nov 23, 2021 at 2:50 AM Zixuan Zhao <zhaozixuan2@huawei.com> wrote:
>> We used lat_syscall of lmbench3 to test the performance impact of this
>> patch. We changed the number of rules and run lat_syscall with 1000
>> repetitions at each test. Syscalls measured by lat_syscall are not
>> monitored by rules.
>>
>> Before this optimization:
>>
>> null read write stat fstat open
>> 0 rules 1.87ms 2.74ms 2.56ms 26.31ms 4.13ms 69.66ms
>> 10 rules 2.15ms 3.13ms 3.32ms 26.99ms 4.16ms 74.70ms
>> 20 rules 2.45ms 3.97ms 3.82ms 27.05ms 4.60ms 76.35ms
>> 30 rules 2.64ms 4.52ms 3.95ms 30.30ms 4.94ms 78.94ms
>> 40 rules 2.83ms 4.97ms 4.23ms 32.16ms 5.40ms 81.88ms
>> 50 rules 3.00ms 5.30ms 4.84ms 33.49ms 5.79ms 83.20ms
>> 100 rules 4.24ms 9.75ms 7.42ms 37.68ms 6.55ms 93.70ms
>> 160 rules 5.50ms 16.89ms 12.18ms 51.53ms 17.45ms 155.40ms
>>
>> After this optimization:
>>
>> null read write stat fstat open
>> 0 rules 1.81ms 2.84ms 2.42ms 27.70ms 4.15ms 69.10ms
>> 10 rules 1.97ms 2.83ms 2.69ms 27.70ms 4.15ms 69.30ms
>> 20 rules 1.72ms 2.91ms 2.41ms 26.49ms 3.91ms 71.19ms
>> 30 rules 1.85ms 2.94ms 2.48ms 26.27ms 3.97ms 71.43ms
>> 40 rules 1.88ms 2.94ms 2.78ms 26.85ms 4.08ms 69.79ms
>> 50 rules 1.86ms 3.17ms 3.08ms 26.25ms 4.03ms 72.32ms
>> 100 rules 1.84ms 3.00ms 2.81ms 26.25ms 3.98ms 70.25ms
>> 160 rules 1.92ms 3.32ms 3.06ms 26.81ms 4.57ms 71.41ms
>>
>> As the result shown above, the syscall latencies increase as the
>> number of rules increases, while with the patch the latencies remain stable.
>> This could help when a user adds many audit rules for purposes (such
>> as attack tracing or process behavior recording) but suffers from low
>> performance.
>
>I have general concerns about trading memory and complexity for performance gains, but beyond that the numbers you posted above don't yet make sense to me.
Thanks for your reply.
The memory cost of this patch is less than 4KB (1820 bytes on x86_64,
3640 bytes when compat support is enabled), which is trivial in most
cases. Besides, syscalls are issued very frequently, so even a small
per-call saving adds up.
>Why are the latency increases due to rule count not similar across the
>different syscalls? For example, I would think that if the increase in
>syscall latency was directly attributed to the audit rule processing then
>the increase on the "open" syscall should be similar to that of the "null"
>syscall. In other phrasing, if we can process 160 rules in ~4ms in the
>"null" case, why does it take us ~86ms in the "open" case?
As to the test results, we investigated and identified two causes:
1. The chosen rule sets were not ideal. Though not hit by the syscalls
being measured, some rules were hit by other processes, which degraded
system performance and skewed the results;
2. The lat_syscall routine is more complicated than we expected: it issues
many other syscalls during the test, which can make the results
non-linear.
Given the above, we ran another test. We modified the audit rule sets to
ensure none would be hit at runtime, then instrumented auditsc.c with
ktime_get_real_ts64() to time __audit_syscall_exit. We ran the "stat"
syscall 10000 times for each rule set and recorded the interval. The
results are shown below:
Before this optimization:
rule set time
0 rules 3843.96ns
1 rules 13119.08ns
10 rules 14003.13ns
20 rules 15420.18ns
30 rules 17284.84ns
40 rules 19010.67ns
50 rules 21112.63ns
100 rules 25815.02ns
130 rules 29447.09ns
After this optimization:
rule set time
0 rules 3597.78ns
1 rules 13498.73ns
10 rules 13122.57ns
20 rules 12874.88ns
30 rules 14351.99ns
40 rules 14181.07ns
50 rules 13806.45ns
100 rules 13890.85ns
130 rules 14441.45ns
As the results show, the interval grows linearly with the rule count
before the optimization, while it remains stable after. Note that audit
skips some operations when no rules exist at all, hence the gap between
the 0-rule and 1-rule sets.
* Re: [PATCH] audit: accelerate audit rule filter
2021-11-29 7:35 ` zhaozixuan (C)
@ 2021-12-01 19:39 ` Paul Moore
0 siblings, 0 replies; 7+ messages in thread
From: Paul Moore @ 2021-12-01 19:39 UTC (permalink / raw)
To: zhaozixuan (C); +Cc: linux-audit, linux-kernel, eparis
On Mon, Nov 29, 2021 at 2:35 AM zhaozixuan (C) <zhaozixuan2@huawei.com> wrote:
> >On Tue, Nov 23, 2021 at 2:50 AM Zixuan Zhao <zhaozixuan2@huawei.com> wrote:
> >> We used lat_syscall of lmbench3 to test the performance impact of this
> >> patch. We changed the number of rules and run lat_syscall with 1000
> >> repetitions at each test. Syscalls measured by lat_syscall are not
> >> monitored by rules.
> >>
> >> Before this optimization:
> >>
> >> null read write stat fstat open
> >> 0 rules 1.87ms 2.74ms 2.56ms 26.31ms 4.13ms 69.66ms
> >> 10 rules 2.15ms 3.13ms 3.32ms 26.99ms 4.16ms 74.70ms
> >> 20 rules 2.45ms 3.97ms 3.82ms 27.05ms 4.60ms 76.35ms
> >> 30 rules 2.64ms 4.52ms 3.95ms 30.30ms 4.94ms 78.94ms
> >> 40 rules 2.83ms 4.97ms 4.23ms 32.16ms 5.40ms 81.88ms
> >> 50 rules 3.00ms 5.30ms 4.84ms 33.49ms 5.79ms 83.20ms
> >> 100 rules 4.24ms 9.75ms 7.42ms 37.68ms 6.55ms 93.70ms
> >> 160 rules 5.50ms 16.89ms 12.18ms 51.53ms 17.45ms 155.40ms
> >>
> >> After this optimization:
> >>
> >> null read write stat fstat open
> >> 0 rules 1.81ms 2.84ms 2.42ms 27.70ms 4.15ms 69.10ms
> >> 10 rules 1.97ms 2.83ms 2.69ms 27.70ms 4.15ms 69.30ms
> >> 20 rules 1.72ms 2.91ms 2.41ms 26.49ms 3.91ms 71.19ms
> >> 30 rules 1.85ms 2.94ms 2.48ms 26.27ms 3.97ms 71.43ms
> >> 40 rules 1.88ms 2.94ms 2.78ms 26.85ms 4.08ms 69.79ms
> >> 50 rules 1.86ms 3.17ms 3.08ms 26.25ms 4.03ms 72.32ms
> >> 100 rules 1.84ms 3.00ms 2.81ms 26.25ms 3.98ms 70.25ms
> >> 160 rules 1.92ms 3.32ms 3.06ms 26.81ms 4.57ms 71.41ms
> >>
> >> As the result shown above, the syscall latencies increase as the
> >> number of rules increases, while with the patch the latencies remain stable.
> >> This could help when a user adds many audit rules for purposes (such
> >> as attack tracing or process behavior recording) but suffers from low
> >> performance.
> >
> >I have general concerns about trading memory and complexity for performance gains, but beyond that the numbers you posted above don't yet make sense to me.
>
> Thanks for your reply.
>
> The memory cost of this patch is less than 4KB (1820 bytes on x64 and
> 3640 bytes on compatible x86_64) which is trivial in many cases.
> Besides, syscalls are called frequently on a system so a small
> optimization could bring a good income.
The tradeoff still exists, even though you feel it is worthwhile.
> >Why are the latency increases due to rule count not similar across the
> >different syscalls? For example, I would think that if the increase in
> >syscall latency was directly attributed to the audit rule processing then
> >the increase on the "open" syscall should be similar to that of the
> >"null" syscall. In other phrasing, if we can process 160 rules in ~4ms in
> >the "null" case, why does it take us ~86ms in the "open" case?
>
> As to the test result, we did some investigations and concluded two
> reasons:
> 1. The chosen rule sets were not very suitable. Though they were not hit
> by syscalls being measured, some of them were hit by other processes,
> which reduced the system performance and affected the test result;
> 2. The routine of lat_syscall is much more complicated than we thought. It
> called many other syscalls during the test, which may cause the result
> not to be linear.
>
> Due to the reasons above, we did another test. We modified audit rule sets
> and made sure they wouldn't be hit at runtime. Then, we added
> ktime_get_real_ts64 to auditsc.c to record the time of executing
> __audit_syscall_exit. We ran "stat" syscall 10000 times for each rule set
> and recorded the time interval. The result is shown below:
>
> Before this optimization:
>
> rule set time
> 0 rules 3843.96ns
> 1 rules 13119.08ns
> 10 rules 14003.13ns
> 20 rules 15420.18ns
> 30 rules 17284.84ns
> 40 rules 19010.67ns
> 50 rules 21112.63ns
> 100 rules 25815.02ns
> 130 rules 29447.09ns
>
> After this optimization:
>
> rule set time
> 0 rules 3597.78ns
> 1 rules 13498.73ns
> 10 rules 13122.57ns
> 20 rules 12874.88ns
> 30 rules 14351.99ns
> 40 rules 14181.07ns
> 50 rules 13806.45ns
> 100 rules 13890.85ns
> 130 rules 14441.45ns
>
> As the result showed, the interval is linearly increased before
> optimization while the interval remains stable after optimization. Note
> that audit skips some operations if there are no rules, so there is a gap
> between 0 rule and 1 rule set.
It looks like a single rule like the one below could effectively
disable this optimization, is that correct?
% auditctl -a exit,always -F uid=1001
% auditctl -l
-a always,exit -S all -F uid=1001
--
paul moore
www.paul-moore.com
* Re: [PATCH] audit: accelerate audit rule filter
2021-12-06 2:49 ` Paul Moore
@ 2021-12-06 7:42 ` Burn Alting
0 siblings, 0 replies; 7+ messages in thread
From: Burn Alting @ 2021-12-06 7:42 UTC (permalink / raw)
To: Paul Moore, zhaozixuan (C); +Cc: linux-audit, linux-kernel, eparis
On Sun, 2021-12-05 at 21:49 -0500, Paul Moore wrote:
> On Wed, Dec 1, 2021 at 9:25 PM zhaozixuan (C) <zhaozixuan2@huawei.com> wrote:
> > > On Mon, Nov 29, 2021 at 2:35 AM zhaozixuan (C) <zhaozixuan2@huawei.com>
> > > wrote:
> > > > > On Tue, Nov 23, 2021 at 2:50 AM Zixuan Zhao <zhaozixuan2@huawei.com>
> > > > > wrote:
> > > > > > We used lat_syscall of lmbench3 to test the performance impact of this
> > > > > > patch. We changed the number of rules and run lat_syscall with 1000
> > > > > > repetitions at each test. Syscalls measured by lat_syscall are not
> > > > > > monitored by rules.
> > > > > > Before this optimization:
> > > > > > null read write stat fstat open
> > > > > > 0 rules 1.87ms 2.74ms 2.56ms 26.31ms 4.13ms 69.66ms
> > > > > > 10 rules 2.15ms 3.13ms 3.32ms 26.99ms 4.16ms 74.70ms
> > > > > > 20 rules 2.45ms 3.97ms 3.82ms 27.05ms 4.60ms 76.35ms
> > > > > > 30 rules 2.64ms 4.52ms 3.95ms 30.30ms 4.94ms 78.94ms
> > > > > > 40 rules 2.83ms 4.97ms 4.23ms 32.16ms 5.40ms 81.88ms
> > > > > > 50 rules 3.00ms 5.30ms 4.84ms 33.49ms 5.79ms 83.20ms
> > > > > > 100 rules 4.24ms 9.75ms 7.42ms 37.68ms 6.55ms 93.70ms
> > > > > > 160 rules 5.50ms 16.89ms 12.18ms 51.53ms 17.45ms 155.40ms
> > > > > > After this optimization:
> > > > > > null read write stat fstat open
> > > > > > 0 rules 1.81ms 2.84ms 2.42ms 27.70ms 4.15ms 69.10ms
> > > > > > 10 rules 1.97ms 2.83ms 2.69ms 27.70ms 4.15ms 69.30ms
> > > > > > 20 rules 1.72ms 2.91ms 2.41ms 26.49ms 3.91ms 71.19ms
> > > > > > 30 rules 1.85ms 2.94ms 2.48ms 26.27ms 3.97ms 71.43ms
> > > > > > 40 rules 1.88ms 2.94ms 2.78ms 26.85ms 4.08ms 69.79ms
> > > > > > 50 rules 1.86ms 3.17ms 3.08ms 26.25ms 4.03ms 72.32ms
> > > > > > 100 rules 1.84ms 3.00ms 2.81ms 26.25ms 3.98ms 70.25ms
> > > > > > 160 rules 1.92ms 3.32ms 3.06ms 26.81ms 4.57ms 71.41ms
> > > > > > As the result shown above, the syscall latencies increase as the
> > > > > > number of rules increases, while with the patch the latencies
> > > > > > remain stable. This could help when a user adds many audit rules for
> > > > > > purposes (such as attack tracing or process behavior recording) but
> > > > > > suffers from low performance.
> > > > >
> > > > > I have general concerns about trading memory and complexity for
> > > > > performance gains, but beyond that the numbers you posted above don't yet
> > > > > make sense to me.
> > > >
> > > > Thanks for your reply.
> > > > The memory cost of this patch is less than 4KB (1820 bytes on x64 and
> > > > 3640 bytes on compatible x86_64) which is trivial in many cases.
> > > > Besides, syscalls are called frequently on a system so a small
> > > > optimization could bring a good income.
> > >
> > > The tradeoff still exists, even though you feel it is worthwhile.
> > > > > Why are the latency increases due to rule count not similar across the
> > > > > different syscalls? For example, I would think that if the increase in
> > > > > syscall latency was directly attributed to the audit rule processing
> > > > > then the increase on the "open" syscall should be similar to that of
> > > > > the "null" syscall. In other phrasing, if we can process 160 rules in
> > > > > ~4ms in the "null" case, why does it take us ~86ms in the "open" case?
> > > >
> > > > As to the test result, we did some investigations and concluded two
> > > > reasons:
> > > > 1. The chosen rule sets were not very suitable. Though they were not
> > > > hit by syscalls being measured, some of them were hit by other
> > > > processes, which reduced the system performance and affected the test
> > > > result;
> > > > 2. The routine of lat_syscall is much more complicated than we thought.
> > > > It called many other syscalls during the test, which may cause the
> > > > result not to be linear.
> > > >
> > > > Due to the reasons above, we did another test. We modified audit rule
> > > > sets and made sure they wouldn't be hit at runtime. Then, we added
> > > > ktime_get_real_ts64 to auditsc.c to record the time of executing
> > > > __audit_syscall_exit. We ran "stat" syscall 10000 times for each rule
> > > > set and recorded the time interval. The result is shown below:
> > > > Before this optimization:
> > > > rule set time
> > > > 0 rules 3843.96ns
> > > > 1 rules 13119.08ns
> > > > 10 rules 14003.13ns
> > > > 20 rules 15420.18ns
> > > > 30 rules 17284.84ns
> > > > 40 rules 19010.67ns
> > > > 50 rules 21112.63ns
> > > > 100 rules 25815.02ns
> > > > 130 rules 29447.09ns
> > > > After this optimization:
> > > > rule set time
> > > > 0 rules 3597.78ns
> > > > 1 rules 13498.73ns
> > > > 10 rules 13122.57ns
> > > > 20 rules 12874.88ns
> > > > 30 rules 14351.99ns
> > > > 40 rules 14181.07ns
> > > > 50 rules 13806.45ns
> > > > 100 rules 13890.85ns
> > > > 130 rules 14441.45ns
> > > > As the result showed, the interval is linearly increased before
> > > > optimization while the interval remains stable after optimization.
> > > > Note that audit skips some operations if there are no rules, so there
> > > > is a gap between 0 rule and 1 rule set.
> > >
> > > It looks like a single rule like the one below could effectively disable this
> > > optimization, is that correct?
> > > % auditctl -a exit,always -F uid=1001
> > > % auditctl -l
> > > -a always,exit -S all -F uid=1001
> >
> > Yes, rules like this one which monitor all syscalls would disable the
> > optimization. The size of the global array could increase exponentially
> > if we wanted to handle more audit fields. However, we don't think that
> > kind of rule is practical, because such rules might generate a great
> > number of logs and even lead to log loss.
>
> Before we merge something like this I think we need a better
> understanding of typical audit filter rules used across the different
> audit use cases. This patch is too much of a band-aid to merge
> without a really good promise that it will help most of the real
> world audit deployments.
For a 'real world' deployment, I suggest:

cd /usr/share/audit/sample-rules
cp 10-base-config.rules 11-loginuid.rules 12-ignore-error.rules 30-stig.rules \
   41-containers.rules 43-module-load.rules 71-networking.rules /etc/audit/rules.d/
rm -f /etc/audit/rules.d/audit.rules # Remove default ruleset if not applicable
echo '-b 32768' > /etc/audit/rules.d/zzexecve.rules
echo '-a exit,always -F arch=b32 -F auid!=2147483647 -S execve -k cmds' >> /etc/audit/rules.d/zzexecve.rules
echo '-a exit,always -F arch=b64 -F auid!=4294967295 -S execve -k cmds' >> /etc/audit/rules.d/zzexecve.rules
--
Linux-audit mailing list
Linux-audit@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-audit
* Re: [PATCH] audit: accelerate audit rule filter
2021-12-02 2:25 zhaozixuan (C)
@ 2021-12-06 2:49 ` Paul Moore
2021-12-06 7:42 ` Burn Alting
0 siblings, 1 reply; 7+ messages in thread
From: Paul Moore @ 2021-12-06 2:49 UTC (permalink / raw)
To: zhaozixuan (C); +Cc: linux-audit, linux-kernel, eparis
On Wed, Dec 1, 2021 at 9:25 PM zhaozixuan (C) <zhaozixuan2@huawei.com> wrote:
> > On Mon, Nov 29, 2021 at 2:35 AM zhaozixuan (C) <zhaozixuan2@huawei.com> wrote:
> > > >On Tue, Nov 23, 2021 at 2:50 AM Zixuan Zhao <zhaozixuan2@huawei.com> wrote:
> > > >> We used lat_syscall of lmbench3 to test the performance impact of
> > > >> this patch. We changed the number of rules and run lat_syscall with
> > > >> 1000 repetitions at each test. Syscalls measured by lat_syscall are
> > > >> not monitored by rules.
> > > >>
> > > >> Before this optimization:
> > > >>
> > > >> null read write stat fstat open
> > > >> 0 rules 1.87ms 2.74ms 2.56ms 26.31ms 4.13ms 69.66ms
> > > >> 10 rules 2.15ms 3.13ms 3.32ms 26.99ms 4.16ms 74.70ms
> > > >> 20 rules 2.45ms 3.97ms 3.82ms 27.05ms 4.60ms 76.35ms
> > > >> 30 rules 2.64ms 4.52ms 3.95ms 30.30ms 4.94ms 78.94ms
> > > >> 40 rules 2.83ms 4.97ms 4.23ms 32.16ms 5.40ms 81.88ms
> > > >> 50 rules 3.00ms 5.30ms 4.84ms 33.49ms 5.79ms 83.20ms
> > > >> 100 rules 4.24ms 9.75ms 7.42ms 37.68ms 6.55ms 93.70ms
> > > >> 160 rules 5.50ms 16.89ms 12.18ms 51.53ms 17.45ms 155.40ms
> > > >>
> > > >> After this optimization:
> > > >>
> > > >> null read write stat fstat open
> > > >> 0 rules 1.81ms 2.84ms 2.42ms 27.70ms 4.15ms 69.10ms
> > > >> 10 rules 1.97ms 2.83ms 2.69ms 27.70ms 4.15ms 69.30ms
> > > >> 20 rules 1.72ms 2.91ms 2.41ms 26.49ms 3.91ms 71.19ms
> > > >> 30 rules 1.85ms 2.94ms 2.48ms 26.27ms 3.97ms 71.43ms
> > > >> 40 rules 1.88ms 2.94ms 2.78ms 26.85ms 4.08ms 69.79ms
> > > >> 50 rules 1.86ms 3.17ms 3.08ms 26.25ms 4.03ms 72.32ms
> > > >> 100 rules 1.84ms 3.00ms 2.81ms 26.25ms 3.98ms 70.25ms
> > > >> 160 rules 1.92ms 3.32ms 3.06ms 26.81ms 4.57ms 71.41ms
> > > >>
> > > >> As the results above show, the syscall latencies increase as the
> > > >> number of rules increases, while with the patch the latencies
> > > >> remain stable. This could help when a user adds many audit rules
> > > >> for purposes such as attack tracing or process behavior recording
> > > >> but suffers from low performance.
> > > >
> > > >I have general concerns about trading memory and complexity for performance gains, but beyond that the numbers you posted above don't yet make sense to me.
> > >
> > > Thanks for your reply.
> > >
> > > The memory cost of this patch is less than 4KB (1820 bytes on x64 and
> > > 3640 bytes on compatible x86_64), which is trivial in many cases.
> > > Besides, syscalls are called frequently on a system, so even a small
> > > per-call optimization can bring a worthwhile overall gain.
> >
> > The tradeoff still exists, even though you feel it is worthwhile.
> >
> > > >Why are the latency increases due to rule count not similar across
> > > >the different syscalls? For example, I would think that if the
> > > >increase in syscall latency was directly attributed to the audit rule
> > > >processing then the increase on the "open" syscall should be similar
> > > >to that of the "null" syscall. In other phrasing, if we can process
> > > >160 rules in ~4ms in the "null" case, why does it take us ~86ms in
> > > >the "open" case?
> > >
> > > As to the test result, we did some investigations and concluded two
> > > reasons:
> > > 1. The chosen rule sets were not very suitable. Though they were not
> > > hit by syscalls being measured, some of them were hit by other
> > > processes, which reduced the system performance and affected the test
> > > result; 2. The routine of lat_syscall is much more complicated than we
> > > thought. It called many other syscalls during the test, which may
> > > cause the result not to be linear.
> > >
> > > Due to the reasons above, we did another test. We modified audit rule
> > > sets and made sure they wouldn't be hit at runtime. Then, we added
> > > ktime_get_real_ts64 to auditsc.c to record the time of executing
> > > __audit_syscall_exit. We ran "stat" syscall 10000 times for each rule
> > > set and recorded the time interval. The result is shown below:
> > >
> > > Before this optimization:
> > >
> > > rule set time
> > > 0 rules 3843.96ns
> > > 1 rules 13119.08ns
> > > 10 rules 14003.13ns
> > > 20 rules 15420.18ns
> > > 30 rules 17284.84ns
> > > 40 rules 19010.67ns
> > > 50 rules 21112.63ns
> > > 100 rules 25815.02ns
> > > 130 rules 29447.09ns
> > >
> > > After this optimization:
> > >
> > > rule set time
> > > 0 rules 3597.78ns
> > > 1 rules 13498.73ns
> > > 10 rules 13122.57ns
> > > 20 rules 12874.88ns
> > > 30 rules 14351.99ns
> > > 40 rules 14181.07ns
> > > 50 rules 13806.45ns
> > > 100 rules 13890.85ns
> > > 130 rules 14441.45ns
> > >
> > > As the results show, the interval increases linearly before the
> > > optimization, while it remains stable afterwards. Note that audit
> > > skips some operations if there are no rules, so there is a gap
> > > between the 0-rule and 1-rule sets.
> >
> > It looks like a single rule like the one below could effectively disable this optimization, is that correct?
> >
> > % auditctl -a exit,always -F uid=1001
> > % auditctl -l
> > -a always,exit -S all -F uid=1001
>
> Yes, rules like this one which monitor all syscalls would disable the
> optimization. The size of the global array could increase exponentially
> if we wanted to handle more audit fields. However, we don't think that
> kind of rule is practical, because such rules might generate a great
> number of logs and even lead to log loss.
Before we merge something like this I think we need a better
understanding of typical audit filter rules used across the different
audit use cases. This patch is too much of a band-aid to merge
without a really good promise that it will help most of the real world
audit deployments.
--
paul moore
www.paul-moore.com
* Re: [PATCH] audit: accelerate audit rule filter
@ 2021-12-02 2:25 zhaozixuan (C)
2021-12-06 2:49 ` Paul Moore
0 siblings, 1 reply; 7+ messages in thread
From: zhaozixuan (C) @ 2021-12-02 2:25 UTC (permalink / raw)
To: Paul Moore; +Cc: linux-audit, linux-kernel, eparis
> On Mon, Nov 29, 2021 at 2:35 AM zhaozixuan (C) <zhaozixuan2@huawei.com> wrote:
> > >On Tue, Nov 23, 2021 at 2:50 AM Zixuan Zhao <zhaozixuan2@huawei.com> wrote:
> > >> We used lat_syscall of lmbench3 to test the performance impact of
> > >> this patch. We changed the number of rules and run lat_syscall with
> > >> 1000 repetitions at each test. Syscalls measured by lat_syscall are
> > >> not monitored by rules.
> > >>
> > >> Before this optimization:
> > >>
> > >> null read write stat fstat open
> > >> 0 rules 1.87ms 2.74ms 2.56ms 26.31ms 4.13ms 69.66ms
> > >> 10 rules 2.15ms 3.13ms 3.32ms 26.99ms 4.16ms 74.70ms
> > >> 20 rules 2.45ms 3.97ms 3.82ms 27.05ms 4.60ms 76.35ms
> > >> 30 rules 2.64ms 4.52ms 3.95ms 30.30ms 4.94ms 78.94ms
> > >> 40 rules 2.83ms 4.97ms 4.23ms 32.16ms 5.40ms 81.88ms
> > >> 50 rules 3.00ms 5.30ms 4.84ms 33.49ms 5.79ms 83.20ms
> > >> 100 rules 4.24ms 9.75ms 7.42ms 37.68ms 6.55ms 93.70ms
> > >> 160 rules 5.50ms 16.89ms 12.18ms 51.53ms 17.45ms 155.40ms
> > >>
> > >> After this optimization:
> > >>
> > >> null read write stat fstat open
> > >> 0 rules 1.81ms 2.84ms 2.42ms 27.70ms 4.15ms 69.10ms
> > >> 10 rules 1.97ms 2.83ms 2.69ms 27.70ms 4.15ms 69.30ms
> > >> 20 rules 1.72ms 2.91ms 2.41ms 26.49ms 3.91ms 71.19ms
> > >> 30 rules 1.85ms 2.94ms 2.48ms 26.27ms 3.97ms 71.43ms
> > >> 40 rules 1.88ms 2.94ms 2.78ms 26.85ms 4.08ms 69.79ms
> > >> 50 rules 1.86ms 3.17ms 3.08ms 26.25ms 4.03ms 72.32ms
> > >> 100 rules 1.84ms 3.00ms 2.81ms 26.25ms 3.98ms 70.25ms
> > >> 160 rules 1.92ms 3.32ms 3.06ms 26.81ms 4.57ms 71.41ms
> > >>
> > >> As the results above show, the syscall latencies increase as the
> > >> number of rules increases, while with the patch the latencies remain
> > >> stable. This could help when a user adds many audit rules for
> > >> purposes such as attack tracing or process behavior recording but
> > >> suffers from low performance.
> > >
> > >I have general concerns about trading memory and complexity for performance gains, but beyond that the numbers you posted above don't yet make sense to me.
> >
> > Thanks for your reply.
> >
> > The memory cost of this patch is less than 4KB (1820 bytes on x64 and
> > 3640 bytes on compatible x86_64), which is trivial in many cases.
> > Besides, syscalls are called frequently on a system, so even a small
> > per-call optimization can bring a worthwhile overall gain.
>
> The tradeoff still exists, even though you feel it is worthwhile.
>
> > >Why are the latency increases due to rule count not similar across the
> > >different syscalls? For example, I would think that if the increase in
> > >syscall latency was directly attributed to the audit rule processing
> > >then the increase on the "open" syscall should be similar to that of
> > >the "null" syscall. In other phrasing, if we can process 160 rules in
> > >~4ms in the "null" case, why does it take us ~86ms in the "open" case?
> >
> > As to the test result, we did some investigations and concluded two
> > reasons:
> > 1. The chosen rule sets were not very suitable. Though they were not
> > hit by syscalls being measured, some of them were hit by other
> > processes, which reduced the system performance and affected the test
> > result; 2. The routine of lat_syscall is much more complicated than we
> > thought. It called many other syscalls during the test, which may
> > cause the result not to be linear.
> >
> > Due to the reasons above, we did another test. We modified audit rule
> > sets and made sure they wouldn't be hit at runtime. Then, we added
> > ktime_get_real_ts64 to auditsc.c to record the time of executing
> > __audit_syscall_exit. We ran "stat" syscall 10000 times for each rule
> > set and recorded the time interval. The result is shown below:
> >
> > Before this optimization:
> >
> > rule set time
> > 0 rules 3843.96ns
> > 1 rules 13119.08ns
> > 10 rules 14003.13ns
> > 20 rules 15420.18ns
> > 30 rules 17284.84ns
> > 40 rules 19010.67ns
> > 50 rules 21112.63ns
> > 100 rules 25815.02ns
> > 130 rules 29447.09ns
> >
> > After this optimization:
> >
> > rule set time
> > 0 rules 3597.78ns
> > 1 rules 13498.73ns
> > 10 rules 13122.57ns
> > 20 rules 12874.88ns
> > 30 rules 14351.99ns
> > 40 rules 14181.07ns
> > 50 rules 13806.45ns
> > 100 rules 13890.85ns
> > 130 rules 14441.45ns
> >
> > As the results show, the interval increases linearly before the
> > optimization, while it remains stable afterwards. Note that audit
> > skips some operations if there are no rules, so there is a gap
> > between the 0-rule and 1-rule sets.
>
> It looks like a single rule like the one below could effectively disable this optimization, is that correct?
>
> % auditctl -a exit,always -F uid=1001
> % auditctl -l
> -a always,exit -S all -F uid=1001
Yes, rules like this one which monitor all syscalls would disable the
optimization. The size of the global array could increase exponentially
if we wanted to handle more audit fields. However, we don't think that
kind of rule is practical, because such rules might generate a great
number of logs and even lead to log loss.
end of thread, other threads:[~2021-12-06 7:46 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-23 7:50 [PATCH] audit: accelerate audit rule filter Zixuan Zhao
2021-11-24 15:42 ` Paul Moore
2021-11-29 7:35 ` zhaozixuan (C)
2021-12-01 19:39 ` Paul Moore
2021-12-02 2:25 zhaozixuan (C)
2021-12-06 2:49 ` Paul Moore
2021-12-06 7:42 ` Burn Alting