* [PATCH v4] selinux: cache the SID -> context string translation
@ 2019-11-11 15:40 Ondrej Mosnacek
2019-11-12 15:23 ` Stephen Smalley
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Ondrej Mosnacek @ 2019-11-11 15:40 UTC (permalink / raw)
To: selinux, Paul Moore
Cc: Stephen Smalley, rcu, Paul E. McKenney, Michal Sekletar
Translating a context struct to string can be quite slow, especially if
the context has a lot of category bits set. This can cause quite
noticeable performance impact in situations where the translation needs
to be done repeatedly. A common example is a UNIX datagram socket with
the SO_PASSSEC option enabled, which is used e.g. by systemd-journald
when receiving log messages via datagram socket. This scenario can be
reproduced with:
cat /dev/urandom | base64 | logger &
timeout 30s perf record -p $(pidof systemd-journald) -a -g
kill %1
perf report -g none --pretty raw | grep security_secid_to_secctx
Before the caching introduced by this patch, computing the context
string (security_secid_to_secctx() function) takes up ~65% of
systemd-journald's CPU time (assuming a context with 1024 categories
set and Fedora x86_64 release kernel configs). After this patch
(assuming near-perfect cache hit ratio) this overhead is reduced to just
~2%.
This patch addresses the issue by caching a certain number (compile-time
configurable) of recently used context strings to speed up repeated
translations of the same context, while using only a small amount of
memory.
The cache is integrated into the existing sidtab table by adding a field
to each entry, which when not NULL contains an RCU-protected pointer to
a cache entry containing the cached string. The cache entries are kept
in a linked list sorted according to how recently they were used. On a
cache miss when the cache is full, the least recently used entry is
removed to make space for the new entry.
The patch migrates security_sid_to_context_core() to use the cache (also
a few other functions where it was possible without too much fuss, but
these mostly use the translation for logging in case of error, which is
rare).
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1733259
Cc: Michal Sekletar <msekleta@redhat.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
Changes in v4:
- use rcu_dereference_protected() instead of rcu_dereference_raw() in
sidtab_sid2str_put()
- fix typo in comment
- remove unnecessary rcu_head_init() call
Changes in v3:
- add rcu@vger.kernel.org and Paul McKenney to Cc for review of the RCU
logic
- add __rcu annotation to the cache entry pointer (sidtab.c now passes
sparse checks with C=1)
Changes in v2:
- skip sidtab_sid2str_put() when in non-task context to prevent
deadlock while avoiding the need to lock the spinlock with
irqsave/-restore (which is slower)
security/selinux/Kconfig | 11 ++
security/selinux/ss/services.c | 138 +++++++++++++++----------
security/selinux/ss/sidtab.c | 179 +++++++++++++++++++++++++++------
security/selinux/ss/sidtab.h | 58 +++++++++--
4 files changed, 294 insertions(+), 92 deletions(-)
diff --git a/security/selinux/Kconfig b/security/selinux/Kconfig
index 5711689deb6a..35fe8878cf1c 100644
--- a/security/selinux/Kconfig
+++ b/security/selinux/Kconfig
@@ -85,3 +85,14 @@ config SECURITY_SELINUX_CHECKREQPROT_VALUE
via /selinux/checkreqprot if authorized by policy.
If you are unsure how to answer this question, answer 0.
+
+config SECURITY_SELINUX_SID2STR_CACHE_SIZE
+ int "NSA SELinux SID to context string translation cache size"
+ depends on SECURITY_SELINUX
+ default 256
+ help
+ This option defines the size of the internal SID -> context string
+ cache, which improves the performance of context to string
+ conversion. Setting this option to 0 disables the cache completely.
+
+ If unsure, keep the default value.
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index 3a29e7c24ba9..b6dda5261166 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -91,6 +91,12 @@ static int context_struct_to_string(struct policydb *policydb,
char **scontext,
u32 *scontext_len);
+static int sidtab_entry_to_string(struct policydb *policydb,
+ struct sidtab *sidtab,
+ struct sidtab_entry *entry,
+ char **scontext,
+ u32 *scontext_len);
+
static void context_struct_compute_av(struct policydb *policydb,
struct context *scontext,
struct context *tcontext,
@@ -716,20 +722,21 @@ static void context_struct_compute_av(struct policydb *policydb,
}
static int security_validtrans_handle_fail(struct selinux_state *state,
- struct context *ocontext,
- struct context *ncontext,
- struct context *tcontext,
+ struct sidtab_entry *oentry,
+ struct sidtab_entry *nentry,
+ struct sidtab_entry *tentry,
u16 tclass)
{
struct policydb *p = &state->ss->policydb;
+ struct sidtab *sidtab = state->ss->sidtab;
char *o = NULL, *n = NULL, *t = NULL;
u32 olen, nlen, tlen;
- if (context_struct_to_string(p, ocontext, &o, &olen))
+ if (sidtab_entry_to_string(p, sidtab, oentry, &o, &olen))
goto out;
- if (context_struct_to_string(p, ncontext, &n, &nlen))
+ if (sidtab_entry_to_string(p, sidtab, nentry, &n, &nlen))
goto out;
- if (context_struct_to_string(p, tcontext, &t, &tlen))
+ if (sidtab_entry_to_string(p, sidtab, tentry, &t, &tlen))
goto out;
audit_log(audit_context(), GFP_ATOMIC, AUDIT_SELINUX_ERR,
"op=security_validate_transition seresult=denied"
@@ -751,9 +758,9 @@ static int security_compute_validatetrans(struct selinux_state *state,
{
struct policydb *policydb;
struct sidtab *sidtab;
- struct context *ocontext;
- struct context *ncontext;
- struct context *tcontext;
+ struct sidtab_entry *oentry;
+ struct sidtab_entry *nentry;
+ struct sidtab_entry *tentry;
struct class_datum *tclass_datum;
struct constraint_node *constraint;
u16 tclass;
@@ -779,24 +786,24 @@ static int security_compute_validatetrans(struct selinux_state *state,
}
tclass_datum = policydb->class_val_to_struct[tclass - 1];
- ocontext = sidtab_search(sidtab, oldsid);
- if (!ocontext) {
+ oentry = sidtab_search_entry(sidtab, oldsid);
+ if (!oentry) {
pr_err("SELinux: %s: unrecognized SID %d\n",
__func__, oldsid);
rc = -EINVAL;
goto out;
}
- ncontext = sidtab_search(sidtab, newsid);
- if (!ncontext) {
+ nentry = sidtab_search_entry(sidtab, newsid);
+ if (!nentry) {
pr_err("SELinux: %s: unrecognized SID %d\n",
__func__, newsid);
rc = -EINVAL;
goto out;
}
- tcontext = sidtab_search(sidtab, tasksid);
- if (!tcontext) {
+ tentry = sidtab_search_entry(sidtab, tasksid);
+ if (!tentry) {
pr_err("SELinux: %s: unrecognized SID %d\n",
__func__, tasksid);
rc = -EINVAL;
@@ -805,15 +812,16 @@ static int security_compute_validatetrans(struct selinux_state *state,
constraint = tclass_datum->validatetrans;
while (constraint) {
- if (!constraint_expr_eval(policydb, ocontext, ncontext,
- tcontext, constraint->expr)) {
+ if (!constraint_expr_eval(policydb, &oentry->context,
+ &nentry->context, &tentry->context,
+ constraint->expr)) {
if (user)
rc = -EPERM;
else
rc = security_validtrans_handle_fail(state,
- ocontext,
- ncontext,
- tcontext,
+ oentry,
+ nentry,
+ tentry,
tclass);
goto out;
}
@@ -855,7 +863,7 @@ int security_bounded_transition(struct selinux_state *state,
{
struct policydb *policydb;
struct sidtab *sidtab;
- struct context *old_context, *new_context;
+ struct sidtab_entry *old_entry, *new_entry;
struct type_datum *type;
int index;
int rc;
@@ -869,16 +877,16 @@ int security_bounded_transition(struct selinux_state *state,
sidtab = state->ss->sidtab;
rc = -EINVAL;
- old_context = sidtab_search(sidtab, old_sid);
- if (!old_context) {
+ old_entry = sidtab_search_entry(sidtab, old_sid);
+ if (!old_entry) {
pr_err("SELinux: %s: unrecognized SID %u\n",
__func__, old_sid);
goto out;
}
rc = -EINVAL;
- new_context = sidtab_search(sidtab, new_sid);
- if (!new_context) {
+ new_entry = sidtab_search_entry(sidtab, new_sid);
+ if (!new_entry) {
pr_err("SELinux: %s: unrecognized SID %u\n",
__func__, new_sid);
goto out;
@@ -886,10 +894,10 @@ int security_bounded_transition(struct selinux_state *state,
rc = 0;
/* type/domain unchanged */
- if (old_context->type == new_context->type)
+ if (old_entry->context.type == new_entry->context.type)
goto out;
- index = new_context->type;
+ index = new_entry->context.type;
while (true) {
type = policydb->type_val_to_struct[index - 1];
BUG_ON(!type);
@@ -901,7 +909,7 @@ int security_bounded_transition(struct selinux_state *state,
/* @newsid is bounded by @oldsid */
rc = 0;
- if (type->bounds == old_context->type)
+ if (type->bounds == old_entry->context.type)
break;
index = type->bounds;
@@ -912,10 +920,10 @@ int security_bounded_transition(struct selinux_state *state,
char *new_name = NULL;
u32 length;
- if (!context_struct_to_string(policydb, old_context,
- &old_name, &length) &&
- !context_struct_to_string(policydb, new_context,
- &new_name, &length)) {
+ if (!sidtab_entry_to_string(policydb, sidtab, old_entry,
+ &old_name, &length) &&
+ !sidtab_entry_to_string(policydb, sidtab, new_entry,
+ &new_name, &length)) {
audit_log(audit_context(),
GFP_ATOMIC, AUDIT_SELINUX_ERR,
"op=security_bounded_transition "
@@ -1255,6 +1263,23 @@ static int context_struct_to_string(struct policydb *p,
return 0;
}
+static int sidtab_entry_to_string(struct policydb *p,
+ struct sidtab *sidtab,
+ struct sidtab_entry *entry,
+ char **scontext, u32 *scontext_len)
+{
+ int rc = sidtab_sid2str_get(sidtab, entry, scontext, scontext_len);
+
+ if (rc != -ENOENT)
+ return rc;
+
+ rc = context_struct_to_string(p, &entry->context, scontext,
+ scontext_len);
+ if (!rc && scontext)
+ sidtab_sid2str_put(sidtab, entry, *scontext, *scontext_len);
+ return rc;
+}
+
#include "initial_sid_to_string.h"
const char *security_get_initial_sid_context(u32 sid)
@@ -1271,7 +1296,7 @@ static int security_sid_to_context_core(struct selinux_state *state,
{
struct policydb *policydb;
struct sidtab *sidtab;
- struct context *context;
+ struct sidtab_entry *entry;
int rc = 0;
if (scontext)
@@ -1302,21 +1327,23 @@ static int security_sid_to_context_core(struct selinux_state *state,
read_lock(&state->ss->policy_rwlock);
policydb = &state->ss->policydb;
sidtab = state->ss->sidtab;
+
if (force)
- context = sidtab_search_force(sidtab, sid);
+ entry = sidtab_search_entry_force(sidtab, sid);
else
- context = sidtab_search(sidtab, sid);
- if (!context) {
+ entry = sidtab_search_entry(sidtab, sid);
+ if (!entry) {
pr_err("SELinux: %s: unrecognized SID %d\n",
__func__, sid);
rc = -EINVAL;
goto out_unlock;
}
- if (only_invalid && !context->len)
- rc = 0;
- else
- rc = context_struct_to_string(policydb, context, scontext,
- scontext_len);
+ if (only_invalid && !entry->context.len)
+ goto out_unlock;
+
+ rc = sidtab_entry_to_string(policydb, sidtab, entry, scontext,
+ scontext_len);
+
out_unlock:
read_unlock(&state->ss->policy_rwlock);
out:
@@ -1574,19 +1601,20 @@ int security_context_to_sid_force(struct selinux_state *state,
static int compute_sid_handle_invalid_context(
struct selinux_state *state,
- struct context *scontext,
- struct context *tcontext,
+ struct sidtab_entry *sentry,
+ struct sidtab_entry *tentry,
u16 tclass,
struct context *newcontext)
{
struct policydb *policydb = &state->ss->policydb;
+ struct sidtab *sidtab = state->ss->sidtab;
char *s = NULL, *t = NULL, *n = NULL;
u32 slen, tlen, nlen;
struct audit_buffer *ab;
- if (context_struct_to_string(policydb, scontext, &s, &slen))
+ if (sidtab_entry_to_string(policydb, sidtab, sentry, &s, &slen))
goto out;
- if (context_struct_to_string(policydb, tcontext, &t, &tlen))
+ if (sidtab_entry_to_string(policydb, sidtab, tentry, &t, &tlen))
goto out;
if (context_struct_to_string(policydb, newcontext, &n, &nlen))
goto out;
@@ -1645,7 +1673,8 @@ static int security_compute_sid(struct selinux_state *state,
struct policydb *policydb;
struct sidtab *sidtab;
struct class_datum *cladatum = NULL;
- struct context *scontext = NULL, *tcontext = NULL, newcontext;
+ struct context *scontext, *tcontext, newcontext;
+ struct sidtab_entry *sentry, *tentry;
struct role_trans *roletr = NULL;
struct avtab_key avkey;
struct avtab_datum *avdatum;
@@ -1682,21 +1711,24 @@ static int security_compute_sid(struct selinux_state *state,
policydb = &state->ss->policydb;
sidtab = state->ss->sidtab;
- scontext = sidtab_search(sidtab, ssid);
- if (!scontext) {
+ sentry = sidtab_search_entry(sidtab, ssid);
+ if (!sentry) {
pr_err("SELinux: %s: unrecognized SID %d\n",
__func__, ssid);
rc = -EINVAL;
goto out_unlock;
}
- tcontext = sidtab_search(sidtab, tsid);
- if (!tcontext) {
+ tentry = sidtab_search_entry(sidtab, tsid);
+ if (!tentry) {
pr_err("SELinux: %s: unrecognized SID %d\n",
__func__, tsid);
rc = -EINVAL;
goto out_unlock;
}
+ scontext = &sentry->context;
+ tcontext = &tentry->context;
+
if (tclass && tclass <= policydb->p_classes.nprim)
cladatum = policydb->class_val_to_struct[tclass - 1];
@@ -1797,10 +1829,8 @@ static int security_compute_sid(struct selinux_state *state,
/* Check the validity of the context. */
if (!policydb_context_isvalid(policydb, &newcontext)) {
- rc = compute_sid_handle_invalid_context(state, scontext,
- tcontext,
- tclass,
- &newcontext);
+ rc = compute_sid_handle_invalid_context(state, sentry, tentry,
+ tclass, &newcontext);
if (rc)
goto out_unlock;
}
diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
index 7d49994e8d5f..6d6ce1c43b49 100644
--- a/security/selinux/ss/sidtab.c
+++ b/security/selinux/ss/sidtab.c
@@ -9,6 +9,8 @@
*/
#include <linux/errno.h>
#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/sched.h>
#include <linux/spinlock.h>
@@ -17,6 +19,14 @@
#include "security.h"
#include "sidtab.h"
+struct sidtab_str_cache {
+ struct rcu_head rcu_member;
+ struct list_head lru_member;
+ struct sidtab_entry *parent;
+ u32 len;
+ char str[];
+};
+
int sidtab_init(struct sidtab *s)
{
u32 i;
@@ -34,24 +44,33 @@ int sidtab_init(struct sidtab *s)
s->convert = NULL;
spin_lock_init(&s->lock);
+
+#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
+ s->cache_free_slots = CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE;
+ INIT_LIST_HEAD(&s->cache_lru_list);
+ spin_lock_init(&s->cache_lock);
+#endif
return 0;
}
int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
{
- struct sidtab_isid_entry *entry;
+ struct sidtab_isid_entry *isid;
int rc;
if (sid == 0 || sid > SECINITSID_NUM)
return -EINVAL;
- entry = &s->isids[sid - 1];
+ isid = &s->isids[sid - 1];
- rc = context_cpy(&entry->context, context);
+ rc = context_cpy(&isid->entry.context, context);
if (rc)
return rc;
- entry->set = 1;
+#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
+ isid->entry.cache = NULL;
+#endif
+ isid->set = 1;
return 0;
}
@@ -88,7 +107,8 @@ static int sidtab_alloc_roots(struct sidtab *s, u32 level)
return 0;
}
-static struct context *sidtab_do_lookup(struct sidtab *s, u32 index, int alloc)
+static struct sidtab_entry *sidtab_do_lookup(struct sidtab *s, u32 index,
+ int alloc)
{
union sidtab_entry_inner *entry;
u32 level, capacity_shift, leaf_index = index / SIDTAB_LEAF_ENTRIES;
@@ -125,10 +145,16 @@ static struct context *sidtab_do_lookup(struct sidtab *s, u32 index, int alloc)
if (!entry->ptr_leaf)
return NULL;
}
- return &entry->ptr_leaf->entries[index % SIDTAB_LEAF_ENTRIES].context;
+ return &entry->ptr_leaf->entries[index % SIDTAB_LEAF_ENTRIES];
+}
+
+/* use when you know that there is enough entries */
+static struct context *sidtab_lookup_unsafe(struct sidtab *s, u32 index)
+{
+ return &sidtab_do_lookup(s, index, 0)->context;
}
-static struct context *sidtab_lookup(struct sidtab *s, u32 index)
+static struct sidtab_entry *sidtab_lookup(struct sidtab *s, u32 index)
{
/* read entries only after reading count */
u32 count = smp_load_acquire(&s->count);
@@ -139,33 +165,34 @@ static struct context *sidtab_lookup(struct sidtab *s, u32 index)
return sidtab_do_lookup(s, index, 0);
}
-static struct context *sidtab_lookup_initial(struct sidtab *s, u32 sid)
+static struct sidtab_entry *sidtab_lookup_initial(struct sidtab *s, u32 sid)
{
- return s->isids[sid - 1].set ? &s->isids[sid - 1].context : NULL;
+ return s->isids[sid - 1].set ? &s->isids[sid - 1].entry : NULL;
}
-static struct context *sidtab_search_core(struct sidtab *s, u32 sid, int force)
+static struct sidtab_entry *sidtab_search_core(struct sidtab *s, u32 sid,
+ int force)
{
- struct context *context;
-
if (sid != 0) {
+ struct sidtab_entry *entry;
+
if (sid > SECINITSID_NUM)
- context = sidtab_lookup(s, sid - (SECINITSID_NUM + 1));
+ entry = sidtab_lookup(s, sid - (SECINITSID_NUM + 1));
else
- context = sidtab_lookup_initial(s, sid);
- if (context && (!context->len || force))
- return context;
+ entry = sidtab_lookup_initial(s, sid);
+ if (entry && (!entry->context.len || force))
+ return entry;
}
return sidtab_lookup_initial(s, SECINITSID_UNLABELED);
}
-struct context *sidtab_search(struct sidtab *s, u32 sid)
+struct sidtab_entry *sidtab_search_entry(struct sidtab *s, u32 sid)
{
return sidtab_search_core(s, sid, 0);
}
-struct context *sidtab_search_force(struct sidtab *s, u32 sid)
+struct sidtab_entry *sidtab_search_entry_force(struct sidtab *s, u32 sid)
{
return sidtab_search_core(s, sid, 1);
}
@@ -230,7 +257,7 @@ static int sidtab_rcache_search(struct sidtab *s, struct context *context,
if (v >= SIDTAB_MAX)
continue;
- if (context_cmp(sidtab_do_lookup(s, v, 0), context)) {
+ if (context_cmp(sidtab_lookup_unsafe(s, v), context)) {
sidtab_rcache_update(s, v, i);
*index = v;
return 0;
@@ -245,7 +272,7 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
unsigned long flags;
u32 count, count_locked, level, pos;
struct sidtab_convert_params *convert;
- struct context *dst, *dst_convert;
+ struct sidtab_entry *dst, *dst_convert;
int rc;
rc = sidtab_rcache_search(s, context, index);
@@ -273,7 +300,7 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
/* if count has changed before we acquired the lock, then catch up */
while (count < count_locked) {
- if (context_cmp(sidtab_do_lookup(s, count, 0), context)) {
+ if (context_cmp(sidtab_lookup_unsafe(s, count), context)) {
sidtab_rcache_push(s, count);
*index = count;
rc = 0;
@@ -293,7 +320,7 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
if (!dst)
goto out_unlock;
- rc = context_cpy(dst, context);
+ rc = context_cpy(&dst->context, context);
if (rc)
goto out_unlock;
@@ -305,13 +332,14 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
rc = -ENOMEM;
dst_convert = sidtab_do_lookup(convert->target, count, 1);
if (!dst_convert) {
- context_destroy(dst);
+ context_destroy(&dst->context);
goto out_unlock;
}
- rc = convert->func(context, dst_convert, convert->args);
+ rc = convert->func(context, &dst_convert->context,
+ convert->args);
if (rc) {
- context_destroy(dst);
+ context_destroy(&dst->context);
goto out_unlock;
}
@@ -341,9 +369,9 @@ int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid)
u32 i;
for (i = 0; i < SECINITSID_NUM; i++) {
- struct sidtab_isid_entry *entry = &s->isids[i];
+ struct sidtab_isid_entry *isid = &s->isids[i];
- if (entry->set && context_cmp(context, &entry->context)) {
+ if (isid->set && context_cmp(context, &isid->entry.context)) {
*sid = i + 1;
return 0;
}
@@ -453,6 +481,14 @@ int sidtab_convert(struct sidtab *s, struct sidtab_convert_params *params)
return rc;
}
+static void sidtab_destroy_entry(struct sidtab_entry *entry)
+{
+ context_destroy(&entry->context);
+#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
+ kfree(rcu_dereference_raw(entry->cache));
+#endif
+}
+
static void sidtab_destroy_tree(union sidtab_entry_inner entry, u32 level)
{
u32 i;
@@ -473,7 +509,7 @@ static void sidtab_destroy_tree(union sidtab_entry_inner entry, u32 level)
return;
for (i = 0; i < SIDTAB_LEAF_ENTRIES; i++)
- context_destroy(&node->entries[i].context);
+ sidtab_destroy_entry(&node->entries[i]);
kfree(node);
}
}
@@ -484,7 +520,7 @@ void sidtab_destroy(struct sidtab *s)
for (i = 0; i < SECINITSID_NUM; i++)
if (s->isids[i].set)
- context_destroy(&s->isids[i].context);
+ sidtab_destroy_entry(&s->isids[i].entry);
level = SIDTAB_MAX_LEVEL;
while (level && !s->roots[level].ptr_inner)
@@ -492,3 +528,88 @@ void sidtab_destroy(struct sidtab *s)
sidtab_destroy_tree(s->roots[level], level);
}
+
+#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
+
+void sidtab_sid2str_put(struct sidtab *s, struct sidtab_entry *entry,
+ const char *str, u32 str_len)
+{
+ struct sidtab_str_cache *cache, *victim;
+
+ /* do not cache invalid contexts */
+ if (entry->context.len)
+ return;
+
+ /*
+ * Skip the put operation when in non-task context to avoid the need
+ * to disable interrupts while holding s->cache_lock.
+ */
+ if (!in_task())
+ return;
+
+ spin_lock(&s->cache_lock);
+
+ cache = rcu_dereference_protected(entry->cache,
+ lockdep_is_held(&s->cache_lock));
+ if (cache) {
+ /* entry in cache - just bump to the head of LRU list */
+ list_move(&cache->lru_member, &s->cache_lru_list);
+ goto out_unlock;
+ }
+
+ cache = kmalloc(sizeof(struct sidtab_str_cache) + str_len, GFP_ATOMIC);
+ if (!cache)
+ goto out_unlock;
+
+ if (s->cache_free_slots == 0) {
+ /* pop a cache entry from the tail and free it */
+ victim = container_of(s->cache_lru_list.prev,
+ struct sidtab_str_cache, lru_member);
+ list_del(&victim->lru_member);
+ kfree_rcu(victim, rcu_member);
+ rcu_assign_pointer(victim->parent->cache, NULL);
+ } else {
+ s->cache_free_slots--;
+ }
+ cache->parent = entry;
+ cache->len = str_len;
+ memcpy(cache->str, str, str_len);
+ list_add(&cache->lru_member, &s->cache_lru_list);
+
+ rcu_assign_pointer(entry->cache, cache);
+
+out_unlock:
+ spin_unlock(&s->cache_lock);
+}
+
+int sidtab_sid2str_get(struct sidtab *s, struct sidtab_entry *entry,
+ char **out, u32 *out_len)
+{
+ struct sidtab_str_cache *cache;
+ int rc = 0;
+
+ if (entry->context.len)
+ return -ENOENT; /* do not cache invalid contexts */
+
+ rcu_read_lock();
+
+ cache = rcu_dereference(entry->cache);
+ if (!cache) {
+ rc = -ENOENT;
+ } else {
+ *out_len = cache->len;
+ if (out) {
+ *out = kmemdup(cache->str, cache->len, GFP_ATOMIC);
+ if (!*out)
+ rc = -ENOMEM;
+ }
+ }
+
+ rcu_read_unlock();
+
+ if (!rc && out)
+ sidtab_sid2str_put(s, entry, *out, *out_len);
+ return rc;
+}
+
+#endif /* CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0 */
diff --git a/security/selinux/ss/sidtab.h b/security/selinux/ss/sidtab.h
index 1f4763141aa1..5fe67a0c307b 100644
--- a/security/selinux/ss/sidtab.h
+++ b/security/selinux/ss/sidtab.h
@@ -16,13 +16,13 @@
#include "context.h"
-struct sidtab_entry_leaf {
+struct sidtab_entry {
struct context context;
+#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
+ struct sidtab_str_cache __rcu *cache;
+#endif
};
-struct sidtab_node_inner;
-struct sidtab_node_leaf;
-
union sidtab_entry_inner {
struct sidtab_node_inner *ptr_inner;
struct sidtab_node_leaf *ptr_leaf;
@@ -38,7 +38,7 @@ union sidtab_entry_inner {
(SIDTAB_NODE_ALLOC_SHIFT - size_to_shift(sizeof(union sidtab_entry_inner)))
#define SIDTAB_INNER_ENTRIES ((size_t)1 << SIDTAB_INNER_SHIFT)
#define SIDTAB_LEAF_ENTRIES \
- (SIDTAB_NODE_ALLOC_SIZE / sizeof(struct sidtab_entry_leaf))
+ (SIDTAB_NODE_ALLOC_SIZE / sizeof(struct sidtab_entry))
#define SIDTAB_MAX_BITS 32
#define SIDTAB_MAX U32_MAX
@@ -48,7 +48,7 @@ union sidtab_entry_inner {
SIDTAB_INNER_SHIFT)
struct sidtab_node_leaf {
- struct sidtab_entry_leaf entries[SIDTAB_LEAF_ENTRIES];
+ struct sidtab_entry entries[SIDTAB_LEAF_ENTRIES];
};
struct sidtab_node_inner {
@@ -57,7 +57,7 @@ struct sidtab_node_inner {
struct sidtab_isid_entry {
int set;
- struct context context;
+ struct sidtab_entry entry;
};
struct sidtab_convert_params {
@@ -83,6 +83,13 @@ struct sidtab {
struct sidtab_convert_params *convert;
spinlock_t lock;
+#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
+ /* SID -> context string cache */
+ u32 cache_free_slots;
+ struct list_head cache_lru_list;
+ spinlock_t cache_lock;
+#endif
+
/* reverse lookup cache - access atomically via {READ|WRITE}_ONCE() */
u32 rcache[SIDTAB_RCACHE_SIZE];
@@ -92,8 +99,22 @@ struct sidtab {
int sidtab_init(struct sidtab *s);
int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context);
-struct context *sidtab_search(struct sidtab *s, u32 sid);
-struct context *sidtab_search_force(struct sidtab *s, u32 sid);
+struct sidtab_entry *sidtab_search_entry(struct sidtab *s, u32 sid);
+struct sidtab_entry *sidtab_search_entry_force(struct sidtab *s, u32 sid);
+
+static inline struct context *sidtab_search(struct sidtab *s, u32 sid)
+{
+ struct sidtab_entry *entry = sidtab_search_entry(s, sid);
+
+ return entry ? &entry->context : NULL;
+}
+
+static inline struct context *sidtab_search_force(struct sidtab *s, u32 sid)
+{
+ struct sidtab_entry *entry = sidtab_search_entry_force(s, sid);
+
+ return entry ? &entry->context : NULL;
+}
int sidtab_convert(struct sidtab *s, struct sidtab_convert_params *params);
@@ -101,6 +122,25 @@ int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid);
void sidtab_destroy(struct sidtab *s);
+#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
+void sidtab_sid2str_put(struct sidtab *s, struct sidtab_entry *entry,
+ const char *str, u32 str_len);
+int sidtab_sid2str_get(struct sidtab *s, struct sidtab_entry *entry,
+ char **out, u32 *out_len);
+#else
+static inline void sidtab_sid2str_put(struct sidtab *s,
+ struct sidtab_entry *entry,
+ const char *str, u32 str_len)
+{
+}
+static inline int sidtab_sid2str_get(struct sidtab *s,
+ struct sidtab_entry *entry,
+ char **out, u32 *out_len)
+{
+ return -ENOENT;
+}
+#endif /* CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0 */
+
#endif /* _SS_SIDTAB_H_ */
--
2.21.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v4] selinux: cache the SID -> context string translation
2019-11-11 15:40 [PATCH v4] selinux: cache the SID -> context string translation Ondrej Mosnacek
@ 2019-11-12 15:23 ` Stephen Smalley
2019-11-13 14:29 ` Paul E. McKenney
2019-11-15 0:42 ` Paul Moore
2 siblings, 0 replies; 6+ messages in thread
From: Stephen Smalley @ 2019-11-12 15:23 UTC (permalink / raw)
To: Ondrej Mosnacek, selinux, Paul Moore
Cc: rcu, Paul E. McKenney, Michal Sekletar
On 11/11/19 10:40 AM, Ondrej Mosnacek wrote:
> Translating a context struct to string can be quite slow, especially if
> the context has a lot of category bits set. This can cause quite
> noticeable performance impact in situations where the translation needs
> to be done repeatedly. A common example is a UNIX datagram socket with
> the SO_PASSSEC option enabled, which is used e.g. by systemd-journald
> when receiving log messages via datagram socket. This scenario can be
> reproduced with:
>
> cat /dev/urandom | base64 | logger &
> timeout 30s perf record -p $(pidof systemd-journald) -a -g
> kill %1
> perf report -g none --pretty raw | grep security_secid_to_secctx
>
> Before the caching introduced by this patch, computing the context
> string (security_secid_to_secctx() function) takes up ~65% of
> systemd-journald's CPU time (assuming a context with 1024 categories
> set and Fedora x86_64 release kernel configs). After this patch
> (assuming near-perfect cache hit ratio) this overhead is reduced to just
> ~2%.
>
> This patch addresses the issue by caching a certain number (compile-time
> configurable) of recently used context strings to speed up repeated
> translations of the same context, while using only a small amount of
> memory.
>
> The cache is integrated into the existing sidtab table by adding a field
> to each entry, which when not NULL contains an RCU-protected pointer to
> a cache entry containing the cached string. The cache entries are kept
> in a linked list sorted according to how recently they were used. On a
> cache miss when the cache is full, the least recently used entry is
> removed to make space for the new entry.
>
> The patch migrates security_sid_to_context_core() to use the cache (also
> a few other functions where it was possible without too much fuss, but
> these mostly use the translation for logging in case of error, which is
> rare).
>
> Link: https://bugzilla.redhat.com/show_bug.cgi?id=1733259
> Cc: Michal Sekletar <msekleta@redhat.com>
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Tested-by: Stephen Smalley <sds@tycho.nsa.gov>
> ---
>
> Changes in v4:
> - use rcu_dereference_protected() instead of rcu_dereference_raw() in
> sidtab_sid2str_put()
> - fix typo in comment
> - remove unnecessary rcu_head_init() call
>
> Changes in v3:
> - add rcu@vger.kernel.org and Paul McKenney to Cc for review of the RCU
> logic
> - add __rcu annotation to the cache entry pointer (sidtab.c now passes
> sparse checks with C=1)
>
> Changes in v2:
> - skip sidtab_sid2str_put() when in non-task context to prevent
> deadlock while avoiding the need to lock the spinlock with
> irqsave/-restore (which is slower)
>
> security/selinux/Kconfig | 11 ++
> security/selinux/ss/services.c | 138 +++++++++++++++----------
> security/selinux/ss/sidtab.c | 179 +++++++++++++++++++++++++++------
> security/selinux/ss/sidtab.h | 58 +++++++++--
> 4 files changed, 294 insertions(+), 92 deletions(-)
>
> diff --git a/security/selinux/Kconfig b/security/selinux/Kconfig
> index 5711689deb6a..35fe8878cf1c 100644
> --- a/security/selinux/Kconfig
> +++ b/security/selinux/Kconfig
> @@ -85,3 +85,14 @@ config SECURITY_SELINUX_CHECKREQPROT_VALUE
> via /selinux/checkreqprot if authorized by policy.
>
> If you are unsure how to answer this question, answer 0.
> +
> +config SECURITY_SELINUX_SID2STR_CACHE_SIZE
> + int "NSA SELinux SID to context string translation cache size"
> + depends on SECURITY_SELINUX
> + default 256
> + help
> + This option defines the size of the internal SID -> context string
> + cache, which improves the performance of context to string
> + conversion. Setting this option to 0 disables the cache completely.
> +
> + If unsure, keep the default value.
> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
> index 3a29e7c24ba9..b6dda5261166 100644
> --- a/security/selinux/ss/services.c
> +++ b/security/selinux/ss/services.c
> @@ -91,6 +91,12 @@ static int context_struct_to_string(struct policydb *policydb,
> char **scontext,
> u32 *scontext_len);
>
> +static int sidtab_entry_to_string(struct policydb *policydb,
> + struct sidtab *sidtab,
> + struct sidtab_entry *entry,
> + char **scontext,
> + u32 *scontext_len);
> +
> static void context_struct_compute_av(struct policydb *policydb,
> struct context *scontext,
> struct context *tcontext,
> @@ -716,20 +722,21 @@ static void context_struct_compute_av(struct policydb *policydb,
> }
>
> static int security_validtrans_handle_fail(struct selinux_state *state,
> - struct context *ocontext,
> - struct context *ncontext,
> - struct context *tcontext,
> + struct sidtab_entry *oentry,
> + struct sidtab_entry *nentry,
> + struct sidtab_entry *tentry,
> u16 tclass)
> {
> struct policydb *p = &state->ss->policydb;
> + struct sidtab *sidtab = state->ss->sidtab;
> char *o = NULL, *n = NULL, *t = NULL;
> u32 olen, nlen, tlen;
>
> - if (context_struct_to_string(p, ocontext, &o, &olen))
> + if (sidtab_entry_to_string(p, sidtab, oentry, &o, &olen))
> goto out;
> - if (context_struct_to_string(p, ncontext, &n, &nlen))
> + if (sidtab_entry_to_string(p, sidtab, nentry, &n, &nlen))
> goto out;
> - if (context_struct_to_string(p, tcontext, &t, &tlen))
> + if (sidtab_entry_to_string(p, sidtab, tentry, &t, &tlen))
> goto out;
> audit_log(audit_context(), GFP_ATOMIC, AUDIT_SELINUX_ERR,
> "op=security_validate_transition seresult=denied"
> @@ -751,9 +758,9 @@ static int security_compute_validatetrans(struct selinux_state *state,
> {
> struct policydb *policydb;
> struct sidtab *sidtab;
> - struct context *ocontext;
> - struct context *ncontext;
> - struct context *tcontext;
> + struct sidtab_entry *oentry;
> + struct sidtab_entry *nentry;
> + struct sidtab_entry *tentry;
> struct class_datum *tclass_datum;
> struct constraint_node *constraint;
> u16 tclass;
> @@ -779,24 +786,24 @@ static int security_compute_validatetrans(struct selinux_state *state,
> }
> tclass_datum = policydb->class_val_to_struct[tclass - 1];
>
> - ocontext = sidtab_search(sidtab, oldsid);
> - if (!ocontext) {
> + oentry = sidtab_search_entry(sidtab, oldsid);
> + if (!oentry) {
> pr_err("SELinux: %s: unrecognized SID %d\n",
> __func__, oldsid);
> rc = -EINVAL;
> goto out;
> }
>
> - ncontext = sidtab_search(sidtab, newsid);
> - if (!ncontext) {
> + nentry = sidtab_search_entry(sidtab, newsid);
> + if (!nentry) {
> pr_err("SELinux: %s: unrecognized SID %d\n",
> __func__, newsid);
> rc = -EINVAL;
> goto out;
> }
>
> - tcontext = sidtab_search(sidtab, tasksid);
> - if (!tcontext) {
> + tentry = sidtab_search_entry(sidtab, tasksid);
> + if (!tentry) {
> pr_err("SELinux: %s: unrecognized SID %d\n",
> __func__, tasksid);
> rc = -EINVAL;
> @@ -805,15 +812,16 @@ static int security_compute_validatetrans(struct selinux_state *state,
>
> constraint = tclass_datum->validatetrans;
> while (constraint) {
> - if (!constraint_expr_eval(policydb, ocontext, ncontext,
> - tcontext, constraint->expr)) {
> + if (!constraint_expr_eval(policydb, &oentry->context,
> + &nentry->context, &tentry->context,
> + constraint->expr)) {
> if (user)
> rc = -EPERM;
> else
> rc = security_validtrans_handle_fail(state,
> - ocontext,
> - ncontext,
> - tcontext,
> + oentry,
> + nentry,
> + tentry,
> tclass);
> goto out;
> }
> @@ -855,7 +863,7 @@ int security_bounded_transition(struct selinux_state *state,
> {
> struct policydb *policydb;
> struct sidtab *sidtab;
> - struct context *old_context, *new_context;
> + struct sidtab_entry *old_entry, *new_entry;
> struct type_datum *type;
> int index;
> int rc;
> @@ -869,16 +877,16 @@ int security_bounded_transition(struct selinux_state *state,
> sidtab = state->ss->sidtab;
>
> rc = -EINVAL;
> - old_context = sidtab_search(sidtab, old_sid);
> - if (!old_context) {
> + old_entry = sidtab_search_entry(sidtab, old_sid);
> + if (!old_entry) {
> pr_err("SELinux: %s: unrecognized SID %u\n",
> __func__, old_sid);
> goto out;
> }
>
> rc = -EINVAL;
> - new_context = sidtab_search(sidtab, new_sid);
> - if (!new_context) {
> + new_entry = sidtab_search_entry(sidtab, new_sid);
> + if (!new_entry) {
> pr_err("SELinux: %s: unrecognized SID %u\n",
> __func__, new_sid);
> goto out;
> @@ -886,10 +894,10 @@ int security_bounded_transition(struct selinux_state *state,
>
> rc = 0;
> /* type/domain unchanged */
> - if (old_context->type == new_context->type)
> + if (old_entry->context.type == new_entry->context.type)
> goto out;
>
> - index = new_context->type;
> + index = new_entry->context.type;
> while (true) {
> type = policydb->type_val_to_struct[index - 1];
> BUG_ON(!type);
> @@ -901,7 +909,7 @@ int security_bounded_transition(struct selinux_state *state,
>
> /* @newsid is bounded by @oldsid */
> rc = 0;
> - if (type->bounds == old_context->type)
> + if (type->bounds == old_entry->context.type)
> break;
>
> index = type->bounds;
> @@ -912,10 +920,10 @@ int security_bounded_transition(struct selinux_state *state,
> char *new_name = NULL;
> u32 length;
>
> - if (!context_struct_to_string(policydb, old_context,
> - &old_name, &length) &&
> - !context_struct_to_string(policydb, new_context,
> - &new_name, &length)) {
> + if (!sidtab_entry_to_string(policydb, sidtab, old_entry,
> + &old_name, &length) &&
> + !sidtab_entry_to_string(policydb, sidtab, new_entry,
> + &new_name, &length)) {
> audit_log(audit_context(),
> GFP_ATOMIC, AUDIT_SELINUX_ERR,
> "op=security_bounded_transition "
> @@ -1255,6 +1263,23 @@ static int context_struct_to_string(struct policydb *p,
> return 0;
> }
>
> +static int sidtab_entry_to_string(struct policydb *p,
> + struct sidtab *sidtab,
> + struct sidtab_entry *entry,
> + char **scontext, u32 *scontext_len)
> +{
> + int rc = sidtab_sid2str_get(sidtab, entry, scontext, scontext_len);
> +
> + if (rc != -ENOENT)
> + return rc;
> +
> + rc = context_struct_to_string(p, &entry->context, scontext,
> + scontext_len);
> + if (!rc && scontext)
> + sidtab_sid2str_put(sidtab, entry, *scontext, *scontext_len);
> + return rc;
> +}
> +
> #include "initial_sid_to_string.h"
>
> const char *security_get_initial_sid_context(u32 sid)
> @@ -1271,7 +1296,7 @@ static int security_sid_to_context_core(struct selinux_state *state,
> {
> struct policydb *policydb;
> struct sidtab *sidtab;
> - struct context *context;
> + struct sidtab_entry *entry;
> int rc = 0;
>
> if (scontext)
> @@ -1302,21 +1327,23 @@ static int security_sid_to_context_core(struct selinux_state *state,
> read_lock(&state->ss->policy_rwlock);
> policydb = &state->ss->policydb;
> sidtab = state->ss->sidtab;
> +
> if (force)
> - context = sidtab_search_force(sidtab, sid);
> + entry = sidtab_search_entry_force(sidtab, sid);
> else
> - context = sidtab_search(sidtab, sid);
> - if (!context) {
> + entry = sidtab_search_entry(sidtab, sid);
> + if (!entry) {
> pr_err("SELinux: %s: unrecognized SID %d\n",
> __func__, sid);
> rc = -EINVAL;
> goto out_unlock;
> }
> - if (only_invalid && !context->len)
> - rc = 0;
> - else
> - rc = context_struct_to_string(policydb, context, scontext,
> - scontext_len);
> + if (only_invalid && !entry->context.len)
> + goto out_unlock;
> +
> + rc = sidtab_entry_to_string(policydb, sidtab, entry, scontext,
> + scontext_len);
> +
> out_unlock:
> read_unlock(&state->ss->policy_rwlock);
> out:
> @@ -1574,19 +1601,20 @@ int security_context_to_sid_force(struct selinux_state *state,
>
> static int compute_sid_handle_invalid_context(
> struct selinux_state *state,
> - struct context *scontext,
> - struct context *tcontext,
> + struct sidtab_entry *sentry,
> + struct sidtab_entry *tentry,
> u16 tclass,
> struct context *newcontext)
> {
> struct policydb *policydb = &state->ss->policydb;
> + struct sidtab *sidtab = state->ss->sidtab;
> char *s = NULL, *t = NULL, *n = NULL;
> u32 slen, tlen, nlen;
> struct audit_buffer *ab;
>
> - if (context_struct_to_string(policydb, scontext, &s, &slen))
> + if (sidtab_entry_to_string(policydb, sidtab, sentry, &s, &slen))
> goto out;
> - if (context_struct_to_string(policydb, tcontext, &t, &tlen))
> + if (sidtab_entry_to_string(policydb, sidtab, tentry, &t, &tlen))
> goto out;
> if (context_struct_to_string(policydb, newcontext, &n, &nlen))
> goto out;
> @@ -1645,7 +1673,8 @@ static int security_compute_sid(struct selinux_state *state,
> struct policydb *policydb;
> struct sidtab *sidtab;
> struct class_datum *cladatum = NULL;
> - struct context *scontext = NULL, *tcontext = NULL, newcontext;
> + struct context *scontext, *tcontext, newcontext;
> + struct sidtab_entry *sentry, *tentry;
> struct role_trans *roletr = NULL;
> struct avtab_key avkey;
> struct avtab_datum *avdatum;
> @@ -1682,21 +1711,24 @@ static int security_compute_sid(struct selinux_state *state,
> policydb = &state->ss->policydb;
> sidtab = state->ss->sidtab;
>
> - scontext = sidtab_search(sidtab, ssid);
> - if (!scontext) {
> + sentry = sidtab_search_entry(sidtab, ssid);
> + if (!sentry) {
> pr_err("SELinux: %s: unrecognized SID %d\n",
> __func__, ssid);
> rc = -EINVAL;
> goto out_unlock;
> }
> - tcontext = sidtab_search(sidtab, tsid);
> - if (!tcontext) {
> + tentry = sidtab_search_entry(sidtab, tsid);
> + if (!tentry) {
> pr_err("SELinux: %s: unrecognized SID %d\n",
> __func__, tsid);
> rc = -EINVAL;
> goto out_unlock;
> }
>
> + scontext = &sentry->context;
> + tcontext = &tentry->context;
> +
> if (tclass && tclass <= policydb->p_classes.nprim)
> cladatum = policydb->class_val_to_struct[tclass - 1];
>
> @@ -1797,10 +1829,8 @@ static int security_compute_sid(struct selinux_state *state,
>
> /* Check the validity of the context. */
> if (!policydb_context_isvalid(policydb, &newcontext)) {
> - rc = compute_sid_handle_invalid_context(state, scontext,
> - tcontext,
> - tclass,
> - &newcontext);
> + rc = compute_sid_handle_invalid_context(state, sentry, tentry,
> + tclass, &newcontext);
> if (rc)
> goto out_unlock;
> }
> diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
> index 7d49994e8d5f..6d6ce1c43b49 100644
> --- a/security/selinux/ss/sidtab.c
> +++ b/security/selinux/ss/sidtab.c
> @@ -9,6 +9,8 @@
> */
> #include <linux/errno.h>
> #include <linux/kernel.h>
> +#include <linux/list.h>
> +#include <linux/rcupdate.h>
> #include <linux/slab.h>
> #include <linux/sched.h>
> #include <linux/spinlock.h>
> @@ -17,6 +19,14 @@
> #include "security.h"
> #include "sidtab.h"
>
> +struct sidtab_str_cache {
> + struct rcu_head rcu_member;
> + struct list_head lru_member;
> + struct sidtab_entry *parent;
> + u32 len;
> + char str[];
> +};
> +
> int sidtab_init(struct sidtab *s)
> {
> u32 i;
> @@ -34,24 +44,33 @@ int sidtab_init(struct sidtab *s)
> s->convert = NULL;
>
> spin_lock_init(&s->lock);
> +
> +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> + s->cache_free_slots = CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE;
> + INIT_LIST_HEAD(&s->cache_lru_list);
> + spin_lock_init(&s->cache_lock);
> +#endif
> return 0;
> }
>
> int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
> {
> - struct sidtab_isid_entry *entry;
> + struct sidtab_isid_entry *isid;
> int rc;
>
> if (sid == 0 || sid > SECINITSID_NUM)
> return -EINVAL;
>
> - entry = &s->isids[sid - 1];
> + isid = &s->isids[sid - 1];
>
> - rc = context_cpy(&entry->context, context);
> + rc = context_cpy(&isid->entry.context, context);
> if (rc)
> return rc;
>
> - entry->set = 1;
> +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> + isid->entry.cache = NULL;
> +#endif
> + isid->set = 1;
> return 0;
> }
>
> @@ -88,7 +107,8 @@ static int sidtab_alloc_roots(struct sidtab *s, u32 level)
> return 0;
> }
>
> -static struct context *sidtab_do_lookup(struct sidtab *s, u32 index, int alloc)
> +static struct sidtab_entry *sidtab_do_lookup(struct sidtab *s, u32 index,
> + int alloc)
> {
> union sidtab_entry_inner *entry;
> u32 level, capacity_shift, leaf_index = index / SIDTAB_LEAF_ENTRIES;
> @@ -125,10 +145,16 @@ static struct context *sidtab_do_lookup(struct sidtab *s, u32 index, int alloc)
> if (!entry->ptr_leaf)
> return NULL;
> }
> - return &entry->ptr_leaf->entries[index % SIDTAB_LEAF_ENTRIES].context;
> + return &entry->ptr_leaf->entries[index % SIDTAB_LEAF_ENTRIES];
> +}
> +
> +/* use when you know that there is enough entries */
> +static struct context *sidtab_lookup_unsafe(struct sidtab *s, u32 index)
> +{
> + return &sidtab_do_lookup(s, index, 0)->context;
> }
>
> -static struct context *sidtab_lookup(struct sidtab *s, u32 index)
> +static struct sidtab_entry *sidtab_lookup(struct sidtab *s, u32 index)
> {
> /* read entries only after reading count */
> u32 count = smp_load_acquire(&s->count);
> @@ -139,33 +165,34 @@ static struct context *sidtab_lookup(struct sidtab *s, u32 index)
> return sidtab_do_lookup(s, index, 0);
> }
>
> -static struct context *sidtab_lookup_initial(struct sidtab *s, u32 sid)
> +static struct sidtab_entry *sidtab_lookup_initial(struct sidtab *s, u32 sid)
> {
> - return s->isids[sid - 1].set ? &s->isids[sid - 1].context : NULL;
> + return s->isids[sid - 1].set ? &s->isids[sid - 1].entry : NULL;
> }
>
> -static struct context *sidtab_search_core(struct sidtab *s, u32 sid, int force)
> +static struct sidtab_entry *sidtab_search_core(struct sidtab *s, u32 sid,
> + int force)
> {
> - struct context *context;
> -
> if (sid != 0) {
> + struct sidtab_entry *entry;
> +
> if (sid > SECINITSID_NUM)
> - context = sidtab_lookup(s, sid - (SECINITSID_NUM + 1));
> + entry = sidtab_lookup(s, sid - (SECINITSID_NUM + 1));
> else
> - context = sidtab_lookup_initial(s, sid);
> - if (context && (!context->len || force))
> - return context;
> + entry = sidtab_lookup_initial(s, sid);
> + if (entry && (!entry->context.len || force))
> + return entry;
> }
>
> return sidtab_lookup_initial(s, SECINITSID_UNLABELED);
> }
>
> -struct context *sidtab_search(struct sidtab *s, u32 sid)
> +struct sidtab_entry *sidtab_search_entry(struct sidtab *s, u32 sid)
> {
> return sidtab_search_core(s, sid, 0);
> }
>
> -struct context *sidtab_search_force(struct sidtab *s, u32 sid)
> +struct sidtab_entry *sidtab_search_entry_force(struct sidtab *s, u32 sid)
> {
> return sidtab_search_core(s, sid, 1);
> }
> @@ -230,7 +257,7 @@ static int sidtab_rcache_search(struct sidtab *s, struct context *context,
> if (v >= SIDTAB_MAX)
> continue;
>
> - if (context_cmp(sidtab_do_lookup(s, v, 0), context)) {
> + if (context_cmp(sidtab_lookup_unsafe(s, v), context)) {
> sidtab_rcache_update(s, v, i);
> *index = v;
> return 0;
> @@ -245,7 +272,7 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
> unsigned long flags;
> u32 count, count_locked, level, pos;
> struct sidtab_convert_params *convert;
> - struct context *dst, *dst_convert;
> + struct sidtab_entry *dst, *dst_convert;
> int rc;
>
> rc = sidtab_rcache_search(s, context, index);
> @@ -273,7 +300,7 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
>
> /* if count has changed before we acquired the lock, then catch up */
> while (count < count_locked) {
> - if (context_cmp(sidtab_do_lookup(s, count, 0), context)) {
> + if (context_cmp(sidtab_lookup_unsafe(s, count), context)) {
> sidtab_rcache_push(s, count);
> *index = count;
> rc = 0;
> @@ -293,7 +320,7 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
> if (!dst)
> goto out_unlock;
>
> - rc = context_cpy(dst, context);
> + rc = context_cpy(&dst->context, context);
> if (rc)
> goto out_unlock;
>
> @@ -305,13 +332,14 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
> rc = -ENOMEM;
> dst_convert = sidtab_do_lookup(convert->target, count, 1);
> if (!dst_convert) {
> - context_destroy(dst);
> + context_destroy(&dst->context);
> goto out_unlock;
> }
>
> - rc = convert->func(context, dst_convert, convert->args);
> + rc = convert->func(context, &dst_convert->context,
> + convert->args);
> if (rc) {
> - context_destroy(dst);
> + context_destroy(&dst->context);
> goto out_unlock;
> }
>
> @@ -341,9 +369,9 @@ int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid)
> u32 i;
>
> for (i = 0; i < SECINITSID_NUM; i++) {
> - struct sidtab_isid_entry *entry = &s->isids[i];
> + struct sidtab_isid_entry *isid = &s->isids[i];
>
> - if (entry->set && context_cmp(context, &entry->context)) {
> + if (isid->set && context_cmp(context, &isid->entry.context)) {
> *sid = i + 1;
> return 0;
> }
> @@ -453,6 +481,14 @@ int sidtab_convert(struct sidtab *s, struct sidtab_convert_params *params)
> return rc;
> }
>
> +static void sidtab_destroy_entry(struct sidtab_entry *entry)
> +{
> + context_destroy(&entry->context);
> +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> + kfree(rcu_dereference_raw(entry->cache));
> +#endif
> +}
> +
> static void sidtab_destroy_tree(union sidtab_entry_inner entry, u32 level)
> {
> u32 i;
> @@ -473,7 +509,7 @@ static void sidtab_destroy_tree(union sidtab_entry_inner entry, u32 level)
> return;
>
> for (i = 0; i < SIDTAB_LEAF_ENTRIES; i++)
> - context_destroy(&node->entries[i].context);
> + sidtab_destroy_entry(&node->entries[i]);
> kfree(node);
> }
> }
> @@ -484,7 +520,7 @@ void sidtab_destroy(struct sidtab *s)
>
> for (i = 0; i < SECINITSID_NUM; i++)
> if (s->isids[i].set)
> - context_destroy(&s->isids[i].context);
> + sidtab_destroy_entry(&s->isids[i].entry);
>
> level = SIDTAB_MAX_LEVEL;
> while (level && !s->roots[level].ptr_inner)
> @@ -492,3 +528,88 @@ void sidtab_destroy(struct sidtab *s)
>
> sidtab_destroy_tree(s->roots[level], level);
> }
> +
> +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> +
> +void sidtab_sid2str_put(struct sidtab *s, struct sidtab_entry *entry,
> + const char *str, u32 str_len)
> +{
> + struct sidtab_str_cache *cache, *victim;
> +
> + /* do not cache invalid contexts */
> + if (entry->context.len)
> + return;
> +
> + /*
> + * Skip the put operation when in non-task context to avoid the need
> + * to disable interrupts while holding s->cache_lock.
> + */
> + if (!in_task())
> + return;
> +
> + spin_lock(&s->cache_lock);
> +
> + cache = rcu_dereference_protected(entry->cache,
> + lockdep_is_held(&s->cache_lock));
> + if (cache) {
> + /* entry in cache - just bump to the head of LRU list */
> + list_move(&cache->lru_member, &s->cache_lru_list);
> + goto out_unlock;
> + }
> +
> + cache = kmalloc(sizeof(struct sidtab_str_cache) + str_len, GFP_ATOMIC);
> + if (!cache)
> + goto out_unlock;
> +
> + if (s->cache_free_slots == 0) {
> + /* pop a cache entry from the tail and free it */
> + victim = container_of(s->cache_lru_list.prev,
> + struct sidtab_str_cache, lru_member);
> + list_del(&victim->lru_member);
> + kfree_rcu(victim, rcu_member);
> + rcu_assign_pointer(victim->parent->cache, NULL);
> + } else {
> + s->cache_free_slots--;
> + }
> + cache->parent = entry;
> + cache->len = str_len;
> + memcpy(cache->str, str, str_len);
> + list_add(&cache->lru_member, &s->cache_lru_list);
> +
> + rcu_assign_pointer(entry->cache, cache);
> +
> +out_unlock:
> + spin_unlock(&s->cache_lock);
> +}
> +
> +int sidtab_sid2str_get(struct sidtab *s, struct sidtab_entry *entry,
> + char **out, u32 *out_len)
> +{
> + struct sidtab_str_cache *cache;
> + int rc = 0;
> +
> + if (entry->context.len)
> + return -ENOENT; /* do not cache invalid contexts */
> +
> + rcu_read_lock();
> +
> + cache = rcu_dereference(entry->cache);
> + if (!cache) {
> + rc = -ENOENT;
> + } else {
> + *out_len = cache->len;
> + if (out) {
> + *out = kmemdup(cache->str, cache->len, GFP_ATOMIC);
> + if (!*out)
> + rc = -ENOMEM;
> + }
> + }
> +
> + rcu_read_unlock();
> +
> + if (!rc && out)
> + sidtab_sid2str_put(s, entry, *out, *out_len);
> + return rc;
> +}
> +
> +#endif /* CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0 */
> diff --git a/security/selinux/ss/sidtab.h b/security/selinux/ss/sidtab.h
> index 1f4763141aa1..5fe67a0c307b 100644
> --- a/security/selinux/ss/sidtab.h
> +++ b/security/selinux/ss/sidtab.h
> @@ -16,13 +16,13 @@
>
> #include "context.h"
>
> -struct sidtab_entry_leaf {
> +struct sidtab_entry {
> struct context context;
> +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> + struct sidtab_str_cache __rcu *cache;
> +#endif
> };
>
> -struct sidtab_node_inner;
> -struct sidtab_node_leaf;
> -
> union sidtab_entry_inner {
> struct sidtab_node_inner *ptr_inner;
> struct sidtab_node_leaf *ptr_leaf;
> @@ -38,7 +38,7 @@ union sidtab_entry_inner {
> (SIDTAB_NODE_ALLOC_SHIFT - size_to_shift(sizeof(union sidtab_entry_inner)))
> #define SIDTAB_INNER_ENTRIES ((size_t)1 << SIDTAB_INNER_SHIFT)
> #define SIDTAB_LEAF_ENTRIES \
> - (SIDTAB_NODE_ALLOC_SIZE / sizeof(struct sidtab_entry_leaf))
> + (SIDTAB_NODE_ALLOC_SIZE / sizeof(struct sidtab_entry))
>
> #define SIDTAB_MAX_BITS 32
> #define SIDTAB_MAX U32_MAX
> @@ -48,7 +48,7 @@ union sidtab_entry_inner {
> SIDTAB_INNER_SHIFT)
>
> struct sidtab_node_leaf {
> - struct sidtab_entry_leaf entries[SIDTAB_LEAF_ENTRIES];
> + struct sidtab_entry entries[SIDTAB_LEAF_ENTRIES];
> };
>
> struct sidtab_node_inner {
> @@ -57,7 +57,7 @@ struct sidtab_node_inner {
>
> struct sidtab_isid_entry {
> int set;
> - struct context context;
> + struct sidtab_entry entry;
> };
>
> struct sidtab_convert_params {
> @@ -83,6 +83,13 @@ struct sidtab {
> struct sidtab_convert_params *convert;
> spinlock_t lock;
>
> +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> + /* SID -> context string cache */
> + u32 cache_free_slots;
> + struct list_head cache_lru_list;
> + spinlock_t cache_lock;
> +#endif
> +
> /* reverse lookup cache - access atomically via {READ|WRITE}_ONCE() */
> u32 rcache[SIDTAB_RCACHE_SIZE];
>
> @@ -92,8 +99,22 @@ struct sidtab {
>
> int sidtab_init(struct sidtab *s);
> int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context);
> -struct context *sidtab_search(struct sidtab *s, u32 sid);
> -struct context *sidtab_search_force(struct sidtab *s, u32 sid);
> +struct sidtab_entry *sidtab_search_entry(struct sidtab *s, u32 sid);
> +struct sidtab_entry *sidtab_search_entry_force(struct sidtab *s, u32 sid);
> +
> +static inline struct context *sidtab_search(struct sidtab *s, u32 sid)
> +{
> + struct sidtab_entry *entry = sidtab_search_entry(s, sid);
> +
> + return entry ? &entry->context : NULL;
> +}
> +
> +static inline struct context *sidtab_search_force(struct sidtab *s, u32 sid)
> +{
> + struct sidtab_entry *entry = sidtab_search_entry_force(s, sid);
> +
> + return entry ? &entry->context : NULL;
> +}
>
> int sidtab_convert(struct sidtab *s, struct sidtab_convert_params *params);
>
> @@ -101,6 +122,25 @@ int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid);
>
> void sidtab_destroy(struct sidtab *s);
>
> +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> +void sidtab_sid2str_put(struct sidtab *s, struct sidtab_entry *entry,
> + const char *str, u32 str_len);
> +int sidtab_sid2str_get(struct sidtab *s, struct sidtab_entry *entry,
> + char **out, u32 *out_len);
> +#else
> +static inline void sidtab_sid2str_put(struct sidtab *s,
> + struct sidtab_entry *entry,
> + const char *str, u32 str_len)
> +{
> +}
> +static inline int sidtab_sid2str_get(struct sidtab *s,
> + struct sidtab_entry *entry,
> + char **out, u32 *out_len)
> +{
> + return -ENOENT;
> +}
> +#endif /* CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0 */
> +
> #endif /* _SS_SIDTAB_H_ */
>
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v4] selinux: cache the SID -> context string translation
2019-11-11 15:40 [PATCH v4] selinux: cache the SID -> context string translation Ondrej Mosnacek
2019-11-12 15:23 ` Stephen Smalley
@ 2019-11-13 14:29 ` Paul E. McKenney
2019-11-15 0:42 ` Paul Moore
2 siblings, 0 replies; 6+ messages in thread
From: Paul E. McKenney @ 2019-11-13 14:29 UTC (permalink / raw)
To: Ondrej Mosnacek
Cc: selinux, Paul Moore, Stephen Smalley, rcu, Michal Sekletar
On Mon, Nov 11, 2019 at 04:40:04PM +0100, Ondrej Mosnacek wrote:
> Translating a context struct to string can be quite slow, especially if
> the context has a lot of category bits set. This can cause quite
> noticeable performance impact in situations where the translation needs
> to be done repeatedly. A common example is a UNIX datagram socket with
> the SO_PASSSEC option enabled, which is used e.g. by systemd-journald
> when receiving log messages via datagram socket. This scenario can be
> reproduced with:
>
> cat /dev/urandom | base64 | logger &
> timeout 30s perf record -p $(pidof systemd-journald) -a -g
> kill %1
> perf report -g none --pretty raw | grep security_secid_to_secctx
>
> Before the caching introduced by this patch, computing the context
> string (security_secid_to_secctx() function) takes up ~65% of
> systemd-journald's CPU time (assuming a context with 1024 categories
> set and Fedora x86_64 release kernel configs). After this patch
> (assuming near-perfect cache hit ratio) this overhead is reduced to just
> ~2%.
>
> This patch addresses the issue by caching a certain number (compile-time
> configurable) of recently used context strings to speed up repeated
> translations of the same context, while using only a small amount of
> memory.
>
> The cache is integrated into the existing sidtab table by adding a field
> to each entry, which when not NULL contains an RCU-protected pointer to
> a cache entry containing the cached string. The cache entries are kept
> in a linked list sorted according to how recently they were used. On a
> cache miss when the cache is full, the least recently used entry is
> removed to make space for the new entry.
>
> The patch migrates security_sid_to_context_core() to use the cache (also
> a few other functions where it was possible without too much fuss, but
> these mostly use the translation for logging in case of error, which is
> rare).
>
> Link: https://bugzilla.redhat.com/show_bug.cgi?id=1733259
> Cc: Michal Sekletar <msekleta@redhat.com>
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
> ---
>
> Changes in v4:
> - use rcu_dereference_protected() instead of rcu_dereference_raw() in
> sidtab_sid2str_put()
> - fix typo in comment
> - remove unnecessary rcu_head_init() call
>
> Changes in v3:
> - add rcu@vger.kernel.org and Paul McKenney to Cc for review of the RCU
> logic
> - add __rcu annotation to the cache entry pointer (sidtab.c now passes
> sparse checks with C=1)
>
> Changes in v2:
> - skip sidtab_sid2str_put() when in non-task context to prevent
> deadlock while avoiding the need to lock the spinlock with
> irqsave/-restore (which is slower)
>
> security/selinux/Kconfig | 11 ++
> security/selinux/ss/services.c | 138 +++++++++++++++----------
> security/selinux/ss/sidtab.c | 179 +++++++++++++++++++++++++++------
> security/selinux/ss/sidtab.h | 58 +++++++++--
> 4 files changed, 294 insertions(+), 92 deletions(-)
>
> diff --git a/security/selinux/Kconfig b/security/selinux/Kconfig
> index 5711689deb6a..35fe8878cf1c 100644
> --- a/security/selinux/Kconfig
> +++ b/security/selinux/Kconfig
> @@ -85,3 +85,14 @@ config SECURITY_SELINUX_CHECKREQPROT_VALUE
> via /selinux/checkreqprot if authorized by policy.
>
> If you are unsure how to answer this question, answer 0.
> +
> +config SECURITY_SELINUX_SID2STR_CACHE_SIZE
> + int "NSA SELinux SID to context string translation cache size"
> + depends on SECURITY_SELINUX
> + default 256
> + help
> + This option defines the size of the internal SID -> context string
> + cache, which improves the performance of context to string
> + conversion. Setting this option to 0 disables the cache completely.
> +
> + If unsure, keep the default value.
> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
> index 3a29e7c24ba9..b6dda5261166 100644
> --- a/security/selinux/ss/services.c
> +++ b/security/selinux/ss/services.c
> @@ -91,6 +91,12 @@ static int context_struct_to_string(struct policydb *policydb,
> char **scontext,
> u32 *scontext_len);
>
> +static int sidtab_entry_to_string(struct policydb *policydb,
> + struct sidtab *sidtab,
> + struct sidtab_entry *entry,
> + char **scontext,
> + u32 *scontext_len);
> +
> static void context_struct_compute_av(struct policydb *policydb,
> struct context *scontext,
> struct context *tcontext,
> @@ -716,20 +722,21 @@ static void context_struct_compute_av(struct policydb *policydb,
> }
>
> static int security_validtrans_handle_fail(struct selinux_state *state,
> - struct context *ocontext,
> - struct context *ncontext,
> - struct context *tcontext,
> + struct sidtab_entry *oentry,
> + struct sidtab_entry *nentry,
> + struct sidtab_entry *tentry,
> u16 tclass)
> {
> struct policydb *p = &state->ss->policydb;
> + struct sidtab *sidtab = state->ss->sidtab;
> char *o = NULL, *n = NULL, *t = NULL;
> u32 olen, nlen, tlen;
>
> - if (context_struct_to_string(p, ocontext, &o, &olen))
> + if (sidtab_entry_to_string(p, sidtab, oentry, &o, &olen))
> goto out;
> - if (context_struct_to_string(p, ncontext, &n, &nlen))
> + if (sidtab_entry_to_string(p, sidtab, nentry, &n, &nlen))
> goto out;
> - if (context_struct_to_string(p, tcontext, &t, &tlen))
> + if (sidtab_entry_to_string(p, sidtab, tentry, &t, &tlen))
> goto out;
> audit_log(audit_context(), GFP_ATOMIC, AUDIT_SELINUX_ERR,
> "op=security_validate_transition seresult=denied"
> @@ -751,9 +758,9 @@ static int security_compute_validatetrans(struct selinux_state *state,
> {
> struct policydb *policydb;
> struct sidtab *sidtab;
> - struct context *ocontext;
> - struct context *ncontext;
> - struct context *tcontext;
> + struct sidtab_entry *oentry;
> + struct sidtab_entry *nentry;
> + struct sidtab_entry *tentry;
> struct class_datum *tclass_datum;
> struct constraint_node *constraint;
> u16 tclass;
> @@ -779,24 +786,24 @@ static int security_compute_validatetrans(struct selinux_state *state,
> }
> tclass_datum = policydb->class_val_to_struct[tclass - 1];
>
> - ocontext = sidtab_search(sidtab, oldsid);
> - if (!ocontext) {
> + oentry = sidtab_search_entry(sidtab, oldsid);
> + if (!oentry) {
> pr_err("SELinux: %s: unrecognized SID %d\n",
> __func__, oldsid);
> rc = -EINVAL;
> goto out;
> }
>
> - ncontext = sidtab_search(sidtab, newsid);
> - if (!ncontext) {
> + nentry = sidtab_search_entry(sidtab, newsid);
> + if (!nentry) {
> pr_err("SELinux: %s: unrecognized SID %d\n",
> __func__, newsid);
> rc = -EINVAL;
> goto out;
> }
>
> - tcontext = sidtab_search(sidtab, tasksid);
> - if (!tcontext) {
> + tentry = sidtab_search_entry(sidtab, tasksid);
> + if (!tentry) {
> pr_err("SELinux: %s: unrecognized SID %d\n",
> __func__, tasksid);
> rc = -EINVAL;
> @@ -805,15 +812,16 @@ static int security_compute_validatetrans(struct selinux_state *state,
>
> constraint = tclass_datum->validatetrans;
> while (constraint) {
> - if (!constraint_expr_eval(policydb, ocontext, ncontext,
> - tcontext, constraint->expr)) {
> + if (!constraint_expr_eval(policydb, &oentry->context,
> + &nentry->context, &tentry->context,
> + constraint->expr)) {
> if (user)
> rc = -EPERM;
> else
> rc = security_validtrans_handle_fail(state,
> - ocontext,
> - ncontext,
> - tcontext,
> + oentry,
> + nentry,
> + tentry,
> tclass);
> goto out;
> }
> @@ -855,7 +863,7 @@ int security_bounded_transition(struct selinux_state *state,
> {
> struct policydb *policydb;
> struct sidtab *sidtab;
> - struct context *old_context, *new_context;
> + struct sidtab_entry *old_entry, *new_entry;
> struct type_datum *type;
> int index;
> int rc;
> @@ -869,16 +877,16 @@ int security_bounded_transition(struct selinux_state *state,
> sidtab = state->ss->sidtab;
>
> rc = -EINVAL;
> - old_context = sidtab_search(sidtab, old_sid);
> - if (!old_context) {
> + old_entry = sidtab_search_entry(sidtab, old_sid);
> + if (!old_entry) {
> pr_err("SELinux: %s: unrecognized SID %u\n",
> __func__, old_sid);
> goto out;
> }
>
> rc = -EINVAL;
> - new_context = sidtab_search(sidtab, new_sid);
> - if (!new_context) {
> + new_entry = sidtab_search_entry(sidtab, new_sid);
> + if (!new_entry) {
> pr_err("SELinux: %s: unrecognized SID %u\n",
> __func__, new_sid);
> goto out;
> @@ -886,10 +894,10 @@ int security_bounded_transition(struct selinux_state *state,
>
> rc = 0;
> /* type/domain unchanged */
> - if (old_context->type == new_context->type)
> + if (old_entry->context.type == new_entry->context.type)
> goto out;
>
> - index = new_context->type;
> + index = new_entry->context.type;
> while (true) {
> type = policydb->type_val_to_struct[index - 1];
> BUG_ON(!type);
> @@ -901,7 +909,7 @@ int security_bounded_transition(struct selinux_state *state,
>
> /* @newsid is bounded by @oldsid */
> rc = 0;
> - if (type->bounds == old_context->type)
> + if (type->bounds == old_entry->context.type)
> break;
>
> index = type->bounds;
> @@ -912,10 +920,10 @@ int security_bounded_transition(struct selinux_state *state,
> char *new_name = NULL;
> u32 length;
>
> - if (!context_struct_to_string(policydb, old_context,
> - &old_name, &length) &&
> - !context_struct_to_string(policydb, new_context,
> - &new_name, &length)) {
> + if (!sidtab_entry_to_string(policydb, sidtab, old_entry,
> + &old_name, &length) &&
> + !sidtab_entry_to_string(policydb, sidtab, new_entry,
> + &new_name, &length)) {
> audit_log(audit_context(),
> GFP_ATOMIC, AUDIT_SELINUX_ERR,
> "op=security_bounded_transition "
> @@ -1255,6 +1263,23 @@ static int context_struct_to_string(struct policydb *p,
> return 0;
> }
>
> +static int sidtab_entry_to_string(struct policydb *p,
> + struct sidtab *sidtab,
> + struct sidtab_entry *entry,
> + char **scontext, u32 *scontext_len)
> +{
> + int rc = sidtab_sid2str_get(sidtab, entry, scontext, scontext_len);
> +
> + if (rc != -ENOENT)
> + return rc;
> +
> + rc = context_struct_to_string(p, &entry->context, scontext,
> + scontext_len);
> + if (!rc && scontext)
> + sidtab_sid2str_put(sidtab, entry, *scontext, *scontext_len);
> + return rc;
> +}
> +
> #include "initial_sid_to_string.h"
>
> const char *security_get_initial_sid_context(u32 sid)
> @@ -1271,7 +1296,7 @@ static int security_sid_to_context_core(struct selinux_state *state,
> {
> struct policydb *policydb;
> struct sidtab *sidtab;
> - struct context *context;
> + struct sidtab_entry *entry;
> int rc = 0;
>
> if (scontext)
> @@ -1302,21 +1327,23 @@ static int security_sid_to_context_core(struct selinux_state *state,
> read_lock(&state->ss->policy_rwlock);
> policydb = &state->ss->policydb;
> sidtab = state->ss->sidtab;
> +
> if (force)
> - context = sidtab_search_force(sidtab, sid);
> + entry = sidtab_search_entry_force(sidtab, sid);
> else
> - context = sidtab_search(sidtab, sid);
> - if (!context) {
> + entry = sidtab_search_entry(sidtab, sid);
> + if (!entry) {
> pr_err("SELinux: %s: unrecognized SID %d\n",
> __func__, sid);
> rc = -EINVAL;
> goto out_unlock;
> }
> - if (only_invalid && !context->len)
> - rc = 0;
> - else
> - rc = context_struct_to_string(policydb, context, scontext,
> - scontext_len);
> + if (only_invalid && !entry->context.len)
> + goto out_unlock;
> +
> + rc = sidtab_entry_to_string(policydb, sidtab, entry, scontext,
> + scontext_len);
> +
> out_unlock:
> read_unlock(&state->ss->policy_rwlock);
> out:
> @@ -1574,19 +1601,20 @@ int security_context_to_sid_force(struct selinux_state *state,
>
> static int compute_sid_handle_invalid_context(
> struct selinux_state *state,
> - struct context *scontext,
> - struct context *tcontext,
> + struct sidtab_entry *sentry,
> + struct sidtab_entry *tentry,
> u16 tclass,
> struct context *newcontext)
> {
> struct policydb *policydb = &state->ss->policydb;
> + struct sidtab *sidtab = state->ss->sidtab;
> char *s = NULL, *t = NULL, *n = NULL;
> u32 slen, tlen, nlen;
> struct audit_buffer *ab;
>
> - if (context_struct_to_string(policydb, scontext, &s, &slen))
> + if (sidtab_entry_to_string(policydb, sidtab, sentry, &s, &slen))
> goto out;
> - if (context_struct_to_string(policydb, tcontext, &t, &tlen))
> + if (sidtab_entry_to_string(policydb, sidtab, tentry, &t, &tlen))
> goto out;
> if (context_struct_to_string(policydb, newcontext, &n, &nlen))
> goto out;
> @@ -1645,7 +1673,8 @@ static int security_compute_sid(struct selinux_state *state,
> struct policydb *policydb;
> struct sidtab *sidtab;
> struct class_datum *cladatum = NULL;
> - struct context *scontext = NULL, *tcontext = NULL, newcontext;
> + struct context *scontext, *tcontext, newcontext;
> + struct sidtab_entry *sentry, *tentry;
> struct role_trans *roletr = NULL;
> struct avtab_key avkey;
> struct avtab_datum *avdatum;
> @@ -1682,21 +1711,24 @@ static int security_compute_sid(struct selinux_state *state,
> policydb = &state->ss->policydb;
> sidtab = state->ss->sidtab;
>
> - scontext = sidtab_search(sidtab, ssid);
> - if (!scontext) {
> + sentry = sidtab_search_entry(sidtab, ssid);
> + if (!sentry) {
> pr_err("SELinux: %s: unrecognized SID %d\n",
> __func__, ssid);
> rc = -EINVAL;
> goto out_unlock;
> }
> - tcontext = sidtab_search(sidtab, tsid);
> - if (!tcontext) {
> + tentry = sidtab_search_entry(sidtab, tsid);
> + if (!tentry) {
> pr_err("SELinux: %s: unrecognized SID %d\n",
> __func__, tsid);
> rc = -EINVAL;
> goto out_unlock;
> }
>
> + scontext = &sentry->context;
> + tcontext = &tentry->context;
> +
> if (tclass && tclass <= policydb->p_classes.nprim)
> cladatum = policydb->class_val_to_struct[tclass - 1];
>
> @@ -1797,10 +1829,8 @@ static int security_compute_sid(struct selinux_state *state,
>
> /* Check the validity of the context. */
> if (!policydb_context_isvalid(policydb, &newcontext)) {
> - rc = compute_sid_handle_invalid_context(state, scontext,
> - tcontext,
> - tclass,
> - &newcontext);
> + rc = compute_sid_handle_invalid_context(state, sentry, tentry,
> + tclass, &newcontext);
> if (rc)
> goto out_unlock;
> }
> diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
> index 7d49994e8d5f..6d6ce1c43b49 100644
> --- a/security/selinux/ss/sidtab.c
> +++ b/security/selinux/ss/sidtab.c
> @@ -9,6 +9,8 @@
> */
> #include <linux/errno.h>
> #include <linux/kernel.h>
> +#include <linux/list.h>
> +#include <linux/rcupdate.h>
> #include <linux/slab.h>
> #include <linux/sched.h>
> #include <linux/spinlock.h>
> @@ -17,6 +19,14 @@
> #include "security.h"
> #include "sidtab.h"
>
> +struct sidtab_str_cache {
> + struct rcu_head rcu_member;
> + struct list_head lru_member;
> + struct sidtab_entry *parent;
> + u32 len;
> + char str[];
> +};
> +
> int sidtab_init(struct sidtab *s)
> {
> u32 i;
> @@ -34,24 +44,33 @@ int sidtab_init(struct sidtab *s)
> s->convert = NULL;
>
> spin_lock_init(&s->lock);
> +
> +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> + s->cache_free_slots = CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE;
> + INIT_LIST_HEAD(&s->cache_lru_list);
> + spin_lock_init(&s->cache_lock);
> +#endif
> return 0;
> }
>
> int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
> {
> - struct sidtab_isid_entry *entry;
> + struct sidtab_isid_entry *isid;
> int rc;
>
> if (sid == 0 || sid > SECINITSID_NUM)
> return -EINVAL;
>
> - entry = &s->isids[sid - 1];
> + isid = &s->isids[sid - 1];
>
> - rc = context_cpy(&entry->context, context);
> + rc = context_cpy(&isid->entry.context, context);
> if (rc)
> return rc;
>
> - entry->set = 1;
> +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> + isid->entry.cache = NULL;
> +#endif
> + isid->set = 1;
> return 0;
> }
>
> @@ -88,7 +107,8 @@ static int sidtab_alloc_roots(struct sidtab *s, u32 level)
> return 0;
> }
>
> -static struct context *sidtab_do_lookup(struct sidtab *s, u32 index, int alloc)
> +static struct sidtab_entry *sidtab_do_lookup(struct sidtab *s, u32 index,
> + int alloc)
> {
> union sidtab_entry_inner *entry;
> u32 level, capacity_shift, leaf_index = index / SIDTAB_LEAF_ENTRIES;
> @@ -125,10 +145,16 @@ static struct context *sidtab_do_lookup(struct sidtab *s, u32 index, int alloc)
> if (!entry->ptr_leaf)
> return NULL;
> }
> - return &entry->ptr_leaf->entries[index % SIDTAB_LEAF_ENTRIES].context;
> + return &entry->ptr_leaf->entries[index % SIDTAB_LEAF_ENTRIES];
> +}
> +
> +/* use when you know that there is enough entries */
> +static struct context *sidtab_lookup_unsafe(struct sidtab *s, u32 index)
> +{
> + return &sidtab_do_lookup(s, index, 0)->context;
> }
>
> -static struct context *sidtab_lookup(struct sidtab *s, u32 index)
> +static struct sidtab_entry *sidtab_lookup(struct sidtab *s, u32 index)
> {
> /* read entries only after reading count */
> u32 count = smp_load_acquire(&s->count);
> @@ -139,33 +165,34 @@ static struct context *sidtab_lookup(struct sidtab *s, u32 index)
> return sidtab_do_lookup(s, index, 0);
> }
>
> -static struct context *sidtab_lookup_initial(struct sidtab *s, u32 sid)
> +static struct sidtab_entry *sidtab_lookup_initial(struct sidtab *s, u32 sid)
> {
> - return s->isids[sid - 1].set ? &s->isids[sid - 1].context : NULL;
> + return s->isids[sid - 1].set ? &s->isids[sid - 1].entry : NULL;
> }
>
> -static struct context *sidtab_search_core(struct sidtab *s, u32 sid, int force)
> +static struct sidtab_entry *sidtab_search_core(struct sidtab *s, u32 sid,
> + int force)
> {
> - struct context *context;
> -
> if (sid != 0) {
> + struct sidtab_entry *entry;
> +
> if (sid > SECINITSID_NUM)
> - context = sidtab_lookup(s, sid - (SECINITSID_NUM + 1));
> + entry = sidtab_lookup(s, sid - (SECINITSID_NUM + 1));
> else
> - context = sidtab_lookup_initial(s, sid);
> - if (context && (!context->len || force))
> - return context;
> + entry = sidtab_lookup_initial(s, sid);
> + if (entry && (!entry->context.len || force))
> + return entry;
> }
>
> return sidtab_lookup_initial(s, SECINITSID_UNLABELED);
> }
>
> -struct context *sidtab_search(struct sidtab *s, u32 sid)
> +struct sidtab_entry *sidtab_search_entry(struct sidtab *s, u32 sid)
> {
> return sidtab_search_core(s, sid, 0);
> }
>
> -struct context *sidtab_search_force(struct sidtab *s, u32 sid)
> +struct sidtab_entry *sidtab_search_entry_force(struct sidtab *s, u32 sid)
> {
> return sidtab_search_core(s, sid, 1);
> }
> @@ -230,7 +257,7 @@ static int sidtab_rcache_search(struct sidtab *s, struct context *context,
> if (v >= SIDTAB_MAX)
> continue;
>
> - if (context_cmp(sidtab_do_lookup(s, v, 0), context)) {
> + if (context_cmp(sidtab_lookup_unsafe(s, v), context)) {
> sidtab_rcache_update(s, v, i);
> *index = v;
> return 0;
> @@ -245,7 +272,7 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
> unsigned long flags;
> u32 count, count_locked, level, pos;
> struct sidtab_convert_params *convert;
> - struct context *dst, *dst_convert;
> + struct sidtab_entry *dst, *dst_convert;
> int rc;
>
> rc = sidtab_rcache_search(s, context, index);
> @@ -273,7 +300,7 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
>
> /* if count has changed before we acquired the lock, then catch up */
> while (count < count_locked) {
> - if (context_cmp(sidtab_do_lookup(s, count, 0), context)) {
> + if (context_cmp(sidtab_lookup_unsafe(s, count), context)) {
> sidtab_rcache_push(s, count);
> *index = count;
> rc = 0;
> @@ -293,7 +320,7 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
> if (!dst)
> goto out_unlock;
>
> - rc = context_cpy(dst, context);
> + rc = context_cpy(&dst->context, context);
> if (rc)
> goto out_unlock;
>
> @@ -305,13 +332,14 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
> rc = -ENOMEM;
> dst_convert = sidtab_do_lookup(convert->target, count, 1);
> if (!dst_convert) {
> - context_destroy(dst);
> + context_destroy(&dst->context);
> goto out_unlock;
> }
>
> - rc = convert->func(context, dst_convert, convert->args);
> + rc = convert->func(context, &dst_convert->context,
> + convert->args);
> if (rc) {
> - context_destroy(dst);
> + context_destroy(&dst->context);
> goto out_unlock;
> }
>
> @@ -341,9 +369,9 @@ int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid)
> u32 i;
>
> for (i = 0; i < SECINITSID_NUM; i++) {
> - struct sidtab_isid_entry *entry = &s->isids[i];
> + struct sidtab_isid_entry *isid = &s->isids[i];
>
> - if (entry->set && context_cmp(context, &entry->context)) {
> + if (isid->set && context_cmp(context, &isid->entry.context)) {
> *sid = i + 1;
> return 0;
> }
> @@ -453,6 +481,14 @@ int sidtab_convert(struct sidtab *s, struct sidtab_convert_params *params)
> return rc;
> }
>
> +static void sidtab_destroy_entry(struct sidtab_entry *entry)
> +{
> + context_destroy(&entry->context);
> +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> + kfree(rcu_dereference_raw(entry->cache));
> +#endif
> +}
> +
> static void sidtab_destroy_tree(union sidtab_entry_inner entry, u32 level)
> {
> u32 i;
> @@ -473,7 +509,7 @@ static void sidtab_destroy_tree(union sidtab_entry_inner entry, u32 level)
> return;
>
> for (i = 0; i < SIDTAB_LEAF_ENTRIES; i++)
> - context_destroy(&node->entries[i].context);
> + sidtab_destroy_entry(&node->entries[i]);
> kfree(node);
> }
> }
> @@ -484,7 +520,7 @@ void sidtab_destroy(struct sidtab *s)
>
> for (i = 0; i < SECINITSID_NUM; i++)
> if (s->isids[i].set)
> - context_destroy(&s->isids[i].context);
> + sidtab_destroy_entry(&s->isids[i].entry);
>
> level = SIDTAB_MAX_LEVEL;
> while (level && !s->roots[level].ptr_inner)
> @@ -492,3 +528,88 @@ void sidtab_destroy(struct sidtab *s)
>
> sidtab_destroy_tree(s->roots[level], level);
> }
> +
> +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> +
> +void sidtab_sid2str_put(struct sidtab *s, struct sidtab_entry *entry,
> + const char *str, u32 str_len)
> +{
> + struct sidtab_str_cache *cache, *victim;
> +
> + /* do not cache invalid contexts */
> + if (entry->context.len)
> + return;
> +
> + /*
> + * Skip the put operation when in non-task context to avoid the need
> + * to disable interrupts while holding s->cache_lock.
> + */
> + if (!in_task())
> + return;
> +
> + spin_lock(&s->cache_lock);
> +
> + cache = rcu_dereference_protected(entry->cache,
> + lockdep_is_held(&s->cache_lock));
> + if (cache) {
> + /* entry in cache - just bump to the head of LRU list */
> + list_move(&cache->lru_member, &s->cache_lru_list);
> + goto out_unlock;
> + }
> +
> + cache = kmalloc(sizeof(struct sidtab_str_cache) + str_len, GFP_ATOMIC);
> + if (!cache)
> + goto out_unlock;
> +
> + if (s->cache_free_slots == 0) {
> + /* pop a cache entry from the tail and free it */
> + victim = container_of(s->cache_lru_list.prev,
> + struct sidtab_str_cache, lru_member);
> + list_del(&victim->lru_member);
> + kfree_rcu(victim, rcu_member);
> + rcu_assign_pointer(victim->parent->cache, NULL);
> + } else {
> + s->cache_free_slots--;
> + }
> + cache->parent = entry;
> + cache->len = str_len;
> + memcpy(cache->str, str, str_len);
> + list_add(&cache->lru_member, &s->cache_lru_list);
> +
> + rcu_assign_pointer(entry->cache, cache);
> +
> +out_unlock:
> + spin_unlock(&s->cache_lock);
> +}
> +
> +int sidtab_sid2str_get(struct sidtab *s, struct sidtab_entry *entry,
> + char **out, u32 *out_len)
> +{
> + struct sidtab_str_cache *cache;
> + int rc = 0;
> +
> + if (entry->context.len)
> + return -ENOENT; /* do not cache invalid contexts */
> +
> + rcu_read_lock();
> +
> + cache = rcu_dereference(entry->cache);
> + if (!cache) {
> + rc = -ENOENT;
> + } else {
> + *out_len = cache->len;
> + if (out) {
> + *out = kmemdup(cache->str, cache->len, GFP_ATOMIC);
> + if (!*out)
> + rc = -ENOMEM;
> + }
> + }
> +
> + rcu_read_unlock();
> +
> + if (!rc && out)
> + sidtab_sid2str_put(s, entry, *out, *out_len);
> + return rc;
> +}
> +
> +#endif /* CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0 */
> diff --git a/security/selinux/ss/sidtab.h b/security/selinux/ss/sidtab.h
> index 1f4763141aa1..5fe67a0c307b 100644
> --- a/security/selinux/ss/sidtab.h
> +++ b/security/selinux/ss/sidtab.h
> @@ -16,13 +16,13 @@
>
> #include "context.h"
>
> -struct sidtab_entry_leaf {
> +struct sidtab_entry {
> struct context context;
> +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> + struct sidtab_str_cache __rcu *cache;
> +#endif
> };
>
> -struct sidtab_node_inner;
> -struct sidtab_node_leaf;
> -
> union sidtab_entry_inner {
> struct sidtab_node_inner *ptr_inner;
> struct sidtab_node_leaf *ptr_leaf;
> @@ -38,7 +38,7 @@ union sidtab_entry_inner {
> (SIDTAB_NODE_ALLOC_SHIFT - size_to_shift(sizeof(union sidtab_entry_inner)))
> #define SIDTAB_INNER_ENTRIES ((size_t)1 << SIDTAB_INNER_SHIFT)
> #define SIDTAB_LEAF_ENTRIES \
> - (SIDTAB_NODE_ALLOC_SIZE / sizeof(struct sidtab_entry_leaf))
> + (SIDTAB_NODE_ALLOC_SIZE / sizeof(struct sidtab_entry))
>
> #define SIDTAB_MAX_BITS 32
> #define SIDTAB_MAX U32_MAX
> @@ -48,7 +48,7 @@ union sidtab_entry_inner {
> SIDTAB_INNER_SHIFT)
>
> struct sidtab_node_leaf {
> - struct sidtab_entry_leaf entries[SIDTAB_LEAF_ENTRIES];
> + struct sidtab_entry entries[SIDTAB_LEAF_ENTRIES];
> };
>
> struct sidtab_node_inner {
> @@ -57,7 +57,7 @@ struct sidtab_node_inner {
>
> struct sidtab_isid_entry {
> int set;
> - struct context context;
> + struct sidtab_entry entry;
> };
>
> struct sidtab_convert_params {
> @@ -83,6 +83,13 @@ struct sidtab {
> struct sidtab_convert_params *convert;
> spinlock_t lock;
>
> +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> + /* SID -> context string cache */
> + u32 cache_free_slots;
> + struct list_head cache_lru_list;
> + spinlock_t cache_lock;
> +#endif
> +
> /* reverse lookup cache - access atomically via {READ|WRITE}_ONCE() */
> u32 rcache[SIDTAB_RCACHE_SIZE];
>
> @@ -92,8 +99,22 @@ struct sidtab {
>
> int sidtab_init(struct sidtab *s);
> int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context);
> -struct context *sidtab_search(struct sidtab *s, u32 sid);
> -struct context *sidtab_search_force(struct sidtab *s, u32 sid);
> +struct sidtab_entry *sidtab_search_entry(struct sidtab *s, u32 sid);
> +struct sidtab_entry *sidtab_search_entry_force(struct sidtab *s, u32 sid);
> +
> +static inline struct context *sidtab_search(struct sidtab *s, u32 sid)
> +{
> + struct sidtab_entry *entry = sidtab_search_entry(s, sid);
> +
> + return entry ? &entry->context : NULL;
> +}
> +
> +static inline struct context *sidtab_search_force(struct sidtab *s, u32 sid)
> +{
> + struct sidtab_entry *entry = sidtab_search_entry_force(s, sid);
> +
> + return entry ? &entry->context : NULL;
> +}
>
> int sidtab_convert(struct sidtab *s, struct sidtab_convert_params *params);
>
> @@ -101,6 +122,25 @@ int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid);
>
> void sidtab_destroy(struct sidtab *s);
>
> +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> +void sidtab_sid2str_put(struct sidtab *s, struct sidtab_entry *entry,
> + const char *str, u32 str_len);
> +int sidtab_sid2str_get(struct sidtab *s, struct sidtab_entry *entry,
> + char **out, u32 *out_len);
> +#else
> +static inline void sidtab_sid2str_put(struct sidtab *s,
> + struct sidtab_entry *entry,
> + const char *str, u32 str_len)
> +{
> +}
> +static inline int sidtab_sid2str_get(struct sidtab *s,
> + struct sidtab_entry *entry,
> + char **out, u32 *out_len)
> +{
> + return -ENOENT;
> +}
> +#endif /* CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0 */
> +
> #endif /* _SS_SIDTAB_H_ */
>
>
> --
> 2.21.0
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v4] selinux: cache the SID -> context string translation
2019-11-11 15:40 [PATCH v4] selinux: cache the SID -> context string translation Ondrej Mosnacek
2019-11-12 15:23 ` Stephen Smalley
2019-11-13 14:29 ` Paul E. McKenney
@ 2019-11-15 0:42 ` Paul Moore
2019-11-15 14:50 ` Ondrej Mosnacek
2 siblings, 1 reply; 6+ messages in thread
From: Paul Moore @ 2019-11-15 0:42 UTC (permalink / raw)
To: Ondrej Mosnacek
Cc: selinux, Stephen Smalley, rcu, Paul E. McKenney, Michal Sekletar
On Mon, Nov 11, 2019 at 10:40 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> Translating a context struct to string can be quite slow, especially if
> the context has a lot of category bits set. This can cause quite
> noticeable performance impact in situations where the translation needs
> to be done repeatedly. A common example is a UNIX datagram socket with
> the SO_PASSSEC option enabled, which is used e.g. by systemd-journald
> when receiving log messages via datagram socket. This scenario can be
> reproduced with:
>
> cat /dev/urandom | base64 | logger &
> timeout 30s perf record -p $(pidof systemd-journald) -a -g
> kill %1
> perf report -g none --pretty raw | grep security_secid_to_secctx
>
> Before the caching introduced by this patch, computing the context
> string (security_secid_to_secctx() function) takes up ~65% of
> systemd-journald's CPU time (assuming a context with 1024 categories
> set and Fedora x86_64 release kernel configs). After this patch
> (assuming near-perfect cache hit ratio) this overhead is reduced to just
> ~2%.
>
> This patch addresses the issue by caching a certain number (compile-time
> configurable) of recently used context strings to speed up repeated
> translations of the same context, while using only a small amount of
> memory.
>
> The cache is integrated into the existing sidtab table by adding a field
> to each entry, which when not NULL contains an RCU-protected pointer to
> a cache entry containing the cached string. The cache entries are kept
> in a linked list sorted according to how recently they were used. On a
> cache miss when the cache is full, the least recently used entry is
> removed to make space for the new entry.
>
> The patch migrates security_sid_to_context_core() to use the cache (also
> a few other functions where it was possible without too much fuss, but
> these mostly use the translation for logging in case of error, which is
> rare).
>
> Link: https://bugzilla.redhat.com/show_bug.cgi?id=1733259
> Cc: Michal Sekletar <msekleta@redhat.com>
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> ---
>
> Changes in v4:
> - use rcu_dereference_protected() instead of rcu_dereference_raw() in
> sidtab_sid2str_put()
> - fix typo in comment
> - remove unnecessary rcu_head_init() call
>
> Changes in v3:
> - add rcu@vger.kernel.org and Paul McKenney to Cc for review of the RCU
> logic
> - add __rcu annotation to the cache entry pointer (sidtab.c now passes
> sparse checks with C=1)
>
> Changes in v2:
> - skip sidtab_sid2str_put() when in non-task context to prevent
> deadlock while avoiding the need to lock the spinlock with
> irqsave/-restore (which is slower)
>
> security/selinux/Kconfig | 11 ++
> security/selinux/ss/services.c | 138 +++++++++++++++----------
> security/selinux/ss/sidtab.c | 179 +++++++++++++++++++++++++++------
> security/selinux/ss/sidtab.h | 58 +++++++++--
> 4 files changed, 294 insertions(+), 92 deletions(-)
...
> diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
> index 7d49994e8d5f..6d6ce1c43b49 100644
> --- a/security/selinux/ss/sidtab.c
> +++ b/security/selinux/ss/sidtab.c
> @@ -492,3 +528,88 @@ void sidtab_destroy(struct sidtab *s)
>
> sidtab_destroy_tree(s->roots[level], level);
> }
> +
> +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> +
> +void sidtab_sid2str_put(struct sidtab *s, struct sidtab_entry *entry,
> + const char *str, u32 str_len)
> +{
> + struct sidtab_str_cache *cache, *victim;
> +
> + /* do not cache invalid contexts */
> + if (entry->context.len)
> + return;
> +
> + /*
> + * Skip the put operation when in non-task context to avoid the need
> + * to disable interrupts while holding s->cache_lock.
> + */
> + if (!in_task())
> + return;
> +
> + spin_lock(&s->cache_lock);
> +
> + cache = rcu_dereference_protected(entry->cache,
> + lockdep_is_held(&s->cache_lock));
> + if (cache) {
> + /* entry in cache - just bump to the head of LRU list */
> + list_move(&cache->lru_member, &s->cache_lru_list);
> + goto out_unlock;
> + }
> +
> + cache = kmalloc(sizeof(struct sidtab_str_cache) + str_len, GFP_ATOMIC);
> + if (!cache)
> + goto out_unlock;
> +
> + if (s->cache_free_slots == 0) {
> + /* pop a cache entry from the tail and free it */
> + victim = container_of(s->cache_lru_list.prev,
> + struct sidtab_str_cache, lru_member);
> + list_del(&victim->lru_member);
> + kfree_rcu(victim, rcu_member);
We could move the kfree_rcu() down to after we drop the spinlock,
right? It's likely not a big deal, but since the whole point of this
patch is performance improvements it seems like it might be nice. ;)
> + rcu_assign_pointer(victim->parent->cache, NULL);
> + } else {
> + s->cache_free_slots--;
> + }
> + cache->parent = entry;
> + cache->len = str_len;
> + memcpy(cache->str, str, str_len);
> + list_add(&cache->lru_member, &s->cache_lru_list);
> +
> + rcu_assign_pointer(entry->cache, cache);
> +
> +out_unlock:
> + spin_unlock(&s->cache_lock);
> +}
--
paul moore
www.paul-moore.com
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v4] selinux: cache the SID -> context string translation
2019-11-15 0:42 ` Paul Moore
@ 2019-11-15 14:50 ` Ondrej Mosnacek
2019-11-15 15:36 ` Paul Moore
0 siblings, 1 reply; 6+ messages in thread
From: Ondrej Mosnacek @ 2019-11-15 14:50 UTC (permalink / raw)
To: Paul Moore
Cc: SElinux list, Stephen Smalley, rcu, Paul E. McKenney, Michal Sekletar
On Fri, Nov 15, 2019 at 1:42 AM Paul Moore <paul@paul-moore.com> wrote:
> On Mon, Nov 11, 2019 at 10:40 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> > Translating a context struct to string can be quite slow, especially if
> > the context has a lot of category bits set. This can cause quite
> > noticeable performance impact in situations where the translation needs
> > to be done repeatedly. A common example is a UNIX datagram socket with
> > the SO_PASSSEC option enabled, which is used e.g. by systemd-journald
> > when receiving log messages via datagram socket. This scenario can be
> > reproduced with:
> >
> > cat /dev/urandom | base64 | logger &
> > timeout 30s perf record -p $(pidof systemd-journald) -a -g
> > kill %1
> > perf report -g none --pretty raw | grep security_secid_to_secctx
> >
> > Before the caching introduced by this patch, computing the context
> > string (security_secid_to_secctx() function) takes up ~65% of
> > systemd-journald's CPU time (assuming a context with 1024 categories
> > set and Fedora x86_64 release kernel configs). After this patch
> > (assuming near-perfect cache hit ratio) this overhead is reduced to just
> > ~2%.
> >
> > This patch addresses the issue by caching a certain number (compile-time
> > configurable) of recently used context strings to speed up repeated
> > translations of the same context, while using only a small amount of
> > memory.
> >
> > The cache is integrated into the existing sidtab table by adding a field
> > to each entry, which when not NULL contains an RCU-protected pointer to
> > a cache entry containing the cached string. The cache entries are kept
> > in a linked list sorted according to how recently they were used. On a
> > cache miss when the cache is full, the least recently used entry is
> > removed to make space for the new entry.
> >
> > The patch migrates security_sid_to_context_core() to use the cache (also
> > a few other functions where it was possible without too much fuss, but
> > these mostly use the translation for logging in case of error, which is
> > rare).
> >
> > Link: https://bugzilla.redhat.com/show_bug.cgi?id=1733259
> > Cc: Michal Sekletar <msekleta@redhat.com>
> > Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> > ---
> >
> > Changes in v4:
> > - use rcu_dereference_protected() instead of rcu_dereference_raw() in
> > sidtab_sid2str_put()
> > - fix typo in comment
> > - remove unnecessary rcu_head_init() call
> >
> > Changes in v3:
> > - add rcu@vger.kernel.org and Paul McKenney to Cc for review of the RCU
> > logic
> > - add __rcu annotation to the cache entry pointer (sidtab.c now passes
> > sparse checks with C=1)
> >
> > Changes in v2:
> > - skip sidtab_sid2str_put() when in non-task context to prevent
> > deadlock while avoiding the need to lock the spinlock with
> > irqsave/-restore (which is slower)
> >
> > security/selinux/Kconfig | 11 ++
> > security/selinux/ss/services.c | 138 +++++++++++++++----------
> > security/selinux/ss/sidtab.c | 179 +++++++++++++++++++++++++++------
> > security/selinux/ss/sidtab.h | 58 +++++++++--
> > 4 files changed, 294 insertions(+), 92 deletions(-)
>
> ...
>
> > diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
> > index 7d49994e8d5f..6d6ce1c43b49 100644
> > --- a/security/selinux/ss/sidtab.c
> > +++ b/security/selinux/ss/sidtab.c
> > @@ -492,3 +528,88 @@ void sidtab_destroy(struct sidtab *s)
> >
> > sidtab_destroy_tree(s->roots[level], level);
> > }
> > +
> > +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> > +
> > +void sidtab_sid2str_put(struct sidtab *s, struct sidtab_entry *entry,
> > + const char *str, u32 str_len)
> > +{
> > + struct sidtab_str_cache *cache, *victim;
> > +
> > + /* do not cache invalid contexts */
> > + if (entry->context.len)
> > + return;
> > +
> > + /*
> > + * Skip the put operation when in non-task context to avoid the need
> > + * to disable interrupts while holding s->cache_lock.
> > + */
> > + if (!in_task())
> > + return;
> > +
> > + spin_lock(&s->cache_lock);
> > +
> > + cache = rcu_dereference_protected(entry->cache,
> > + lockdep_is_held(&s->cache_lock));
> > + if (cache) {
> > + /* entry in cache - just bump to the head of LRU list */
> > + list_move(&cache->lru_member, &s->cache_lru_list);
> > + goto out_unlock;
> > + }
> > +
> > + cache = kmalloc(sizeof(struct sidtab_str_cache) + str_len, GFP_ATOMIC);
> > + if (!cache)
> > + goto out_unlock;
> > +
> > + if (s->cache_free_slots == 0) {
> > + /* pop a cache entry from the tail and free it */
> > + victim = container_of(s->cache_lru_list.prev,
> > + struct sidtab_str_cache, lru_member);
> > + list_del(&victim->lru_member);
> > + kfree_rcu(victim, rcu_member);
>
> We could move the kfree_rcu() down to after we drop the spinlock,
> right? It's likely not a big deal, but since the whole point of this
> patch is performance improvements it seems like it might be nice. ;)
I could be wrong, but I think kfree_rcu() just (always?) appends the
object to the RCU list and defers the deallocation for later (and that
should be pretty quick). But actually... since the kfree_rcu() is not
called under RCU read lock here, I should at least move it below the
next line, which still dereferences "victim". And at that point I
could move it all the way after spin_unlock() as you suggest...
>
> > + rcu_assign_pointer(victim->parent->cache, NULL);
> > + } else {
> > + s->cache_free_slots--;
> > + }
> > + cache->parent = entry;
> > + cache->len = str_len;
> > + memcpy(cache->str, str, str_len);
> > + list_add(&cache->lru_member, &s->cache_lru_list);
> > +
> > + rcu_assign_pointer(entry->cache, cache);
> > +
> > +out_unlock:
> > + spin_unlock(&s->cache_lock);
> > +}
>
> --
> paul moore
> www.paul-moore.com
--
Ondrej Mosnacek <omosnace at redhat dot com>
Software Engineer, Security Technologies
Red Hat, Inc.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v4] selinux: cache the SID -> context string translation
2019-11-15 14:50 ` Ondrej Mosnacek
@ 2019-11-15 15:36 ` Paul Moore
0 siblings, 0 replies; 6+ messages in thread
From: Paul Moore @ 2019-11-15 15:36 UTC (permalink / raw)
To: Ondrej Mosnacek
Cc: SElinux list, Stephen Smalley, rcu, Paul E. McKenney, Michal Sekletar
On Fri, Nov 15, 2019 at 9:50 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> On Fri, Nov 15, 2019 at 1:42 AM Paul Moore <paul@paul-moore.com> wrote:
> > On Mon, Nov 11, 2019 at 10:40 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> > > Translating a context struct to string can be quite slow, especially if
> > > the context has a lot of category bits set. This can cause quite
> > > noticeable performance impact in situations where the translation needs
> > > to be done repeatedly. A common example is a UNIX datagram socket with
> > > the SO_PASSSEC option enabled, which is used e.g. by systemd-journald
> > > when receiving log messages via datagram socket. This scenario can be
> > > reproduced with:
> > >
> > > cat /dev/urandom | base64 | logger &
> > > timeout 30s perf record -p $(pidof systemd-journald) -a -g
> > > kill %1
> > > perf report -g none --pretty raw | grep security_secid_to_secctx
> > >
> > > Before the caching introduced by this patch, computing the context
> > > string (security_secid_to_secctx() function) takes up ~65% of
> > > systemd-journald's CPU time (assuming a context with 1024 categories
> > > set and Fedora x86_64 release kernel configs). After this patch
> > > (assuming near-perfect cache hit ratio) this overhead is reduced to just
> > > ~2%.
> > >
> > > This patch addresses the issue by caching a certain number (compile-time
> > > configurable) of recently used context strings to speed up repeated
> > > translations of the same context, while using only a small amount of
> > > memory.
> > >
> > > The cache is integrated into the existing sidtab table by adding a field
> > > to each entry, which when not NULL contains an RCU-protected pointer to
> > > a cache entry containing the cached string. The cache entries are kept
> > > in a linked list sorted according to how recently they were used. On a
> > > cache miss when the cache is full, the least recently used entry is
> > > removed to make space for the new entry.
> > >
> > > The patch migrates security_sid_to_context_core() to use the cache (also
> > > a few other functions where it was possible without too much fuss, but
> > > these mostly use the translation for logging in case of error, which is
> > > rare).
> > >
> > > Link: https://bugzilla.redhat.com/show_bug.cgi?id=1733259
> > > Cc: Michal Sekletar <msekleta@redhat.com>
> > > Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> > > ---
> > >
> > > Changes in v4:
> > > - use rcu_dereference_protected() instead of rcu_dereference_raw() in
> > > sidtab_sid2str_put()
> > > - fix typo in comment
> > > - remove unnecessary rcu_head_init() call
> > >
> > > Changes in v3:
> > > - add rcu@vger.kernel.org and Paul McKenney to Cc for review of the RCU
> > > logic
> > > - add __rcu annotation to the cache entry pointer (sidtab.c now passes
> > > sparse checks with C=1)
> > >
> > > Changes in v2:
> > > - skip sidtab_sid2str_put() when in non-task context to prevent
> > > deadlock while avoiding the need to lock the spinlock with
> > > irqsave/-restore (which is slower)
> > >
> > > security/selinux/Kconfig | 11 ++
> > > security/selinux/ss/services.c | 138 +++++++++++++++----------
> > > security/selinux/ss/sidtab.c | 179 +++++++++++++++++++++++++++------
> > > security/selinux/ss/sidtab.h | 58 +++++++++--
> > > 4 files changed, 294 insertions(+), 92 deletions(-)
> >
> > ...
> >
> > > diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
> > > index 7d49994e8d5f..6d6ce1c43b49 100644
> > > --- a/security/selinux/ss/sidtab.c
> > > +++ b/security/selinux/ss/sidtab.c
> > > @@ -492,3 +528,88 @@ void sidtab_destroy(struct sidtab *s)
> > >
> > > sidtab_destroy_tree(s->roots[level], level);
> > > }
> > > +
> > > +#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0
> > > +
> > > +void sidtab_sid2str_put(struct sidtab *s, struct sidtab_entry *entry,
> > > + const char *str, u32 str_len)
> > > +{
> > > + struct sidtab_str_cache *cache, *victim;
> > > +
> > > + /* do not cache invalid contexts */
> > > + if (entry->context.len)
> > > + return;
> > > +
> > > + /*
> > > + * Skip the put operation when in non-task context to avoid the need
> > > + * to disable interrupts while holding s->cache_lock.
> > > + */
> > > + if (!in_task())
> > > + return;
> > > +
> > > + spin_lock(&s->cache_lock);
> > > +
> > > + cache = rcu_dereference_protected(entry->cache,
> > > + lockdep_is_held(&s->cache_lock));
> > > + if (cache) {
> > > + /* entry in cache - just bump to the head of LRU list */
> > > + list_move(&cache->lru_member, &s->cache_lru_list);
> > > + goto out_unlock;
> > > + }
> > > +
> > > + cache = kmalloc(sizeof(struct sidtab_str_cache) + str_len, GFP_ATOMIC);
> > > + if (!cache)
> > > + goto out_unlock;
> > > +
> > > + if (s->cache_free_slots == 0) {
> > > + /* pop a cache entry from the tail and free it */
> > > + victim = container_of(s->cache_lru_list.prev,
> > > + struct sidtab_str_cache, lru_member);
> > > + list_del(&victim->lru_member);
> > > + kfree_rcu(victim, rcu_member);
> >
> > We could move the kfree_rcu() down to after we drop the spinlock,
> > right? It's likely not a big deal, but since the whole point of this
> > patch is performance improvements it seems like it might be nice. ;)
>
> I could be wrong, but I think kfree_rcu() just (always?) appends the
> object to the RCU list and defers the deallocation for later (and that
> should be pretty quick). But actually... since the kfree_rcu() is not
> called under RCU read lock here, I should at least move it below the
> next line, which still dereferences "victim". And at that point I
> could move it all the way after spin_unlock() as you suggest...
Yes, the bulk of the work is handled later once it is safe to free the
memory, but that doesn't mean work doesn't still happen :)
It's definitely a nitpicky thing, but since we are already at -rc7 and
this isn't a bug-fix, this was always going to land in selinux/next
after the upcoming merge window so we've got time for a respin.
--
paul moore
www.paul-moore.com
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-11-15 15:37 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-11 15:40 [PATCH v4] selinux: cache the SID -> context string translation Ondrej Mosnacek
2019-11-12 15:23 ` Stephen Smalley
2019-11-13 14:29 ` Paul E. McKenney
2019-11-15 0:42 ` Paul Moore
2019-11-15 14:50 ` Ondrej Mosnacek
2019-11-15 15:36 ` Paul Moore
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).