selinux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v2 0/4] Fix ENOMEM errors during policy reload
@ 2018-11-27 10:36 Ondrej Mosnacek
  2018-11-27 10:36 ` [RFC PATCH v2 1/4] selinux: use separate table for initial SID lookup Ondrej Mosnacek
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Ondrej Mosnacek @ 2018-11-27 10:36 UTC (permalink / raw)
  To: selinux, Paul Moore; +Cc: Stephen Smalley, Ondrej Mosnacek

This patchset is a second iteration of this patchset:
https://lore.kernel.org/selinux/20181113135255.26045-1-omosnace@redhat.com/T/

The patches that have "[squash]" in the subject are posted separately for ease
of review but will be folded into the previous patch in the final posting.

Changes in v2:
- dropped the first patch since it is already merged in -next
- fixed initial SID table to not reserve an entry for SECSID_NULL
- added back an equivalent of the reverse lookup cache
- fixed checkpatch.pl errors and some warnings

Notes on the reverse lookup cache:
I updated the sidtab insertion measurements [1] with data for the new patchset
(the difference being the cache addition) and I also measured the base code
with the reverse lookup cache removed [3] for comparison. Note that in this
case (adding a new context) the cache will never give a hit so this only
measures the overhead of traversing the whole cache. It seems that both cache
implementations have similar overhead, the new one being a bit slower (but it
has one entry more so it is expected).

I don't have any data on how the cache helps during regular reverse lookups,
since I couldn't come up with any good way to measure that.

Testing:
A Fedora 29 kernel with this patchset applied passes selinux-testsuite
and fixes GH issue #38 [2]. Additionaly, I decided to eat my own dog food and I
have now been running this kernel on my F29 work machine for several days. So
far everything runs smoothly and I haven't noticed any functional or
performance issues.

[1] https://docs.google.com/spreadsheets/d/e/2PACX-1vRUArNJR6kckm2SEs4dRZlijNVdCTmsNuWRGe7X3fC01YkBHpxXHnmcssxEiMF3Z7ivtXN2L5MC0ry-/pubhtml
[2] https://github.com/SELinuxProject/selinux-kernel/issues/38
[3] https://gitlab.com/omos/linux-public/commit/2c1ebcb8fbdcc6400734ba8d1c3015bd446e6fa5

Ondrej Mosnacek (4):
  selinux: use separate table for initial SID lookup
  [squash] do not store entry for SECSID_NULL
  selinux: overhaul sidtab to fix bug and improve performance
  [squash] add back reverse lookup cache to sidtab

 security/selinux/ss/mls.c      |  23 +-
 security/selinux/ss/mls.h      |   3 +-
 security/selinux/ss/policydb.c |  10 +-
 security/selinux/ss/services.c | 170 +++++----
 security/selinux/ss/services.h |   2 +-
 security/selinux/ss/sidtab.c   | 616 +++++++++++++++++++++------------
 security/selinux/ss/sidtab.h   |  95 +++--
 7 files changed, 560 insertions(+), 359 deletions(-)

-- 
2.19.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH v2 1/4] selinux: use separate table for initial SID lookup
  2018-11-27 10:36 [RFC PATCH v2 0/4] Fix ENOMEM errors during policy reload Ondrej Mosnacek
@ 2018-11-27 10:36 ` Ondrej Mosnacek
  2018-11-27 10:36 ` [RFC PATCH v2 2/4] [squash] do not store entry for SECSID_NULL Ondrej Mosnacek
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Ondrej Mosnacek @ 2018-11-27 10:36 UTC (permalink / raw)
  To: selinux, Paul Moore; +Cc: Stephen Smalley, Ondrej Mosnacek

This moves handling of initial SIDs into a separate table. Note that the
SIDs stored in the main table are now shifted by SECINITSID_NUM and
converted to/from the actual SIDs transparently by helper functions.

This change doesn't make much sense on its own, but it simplifies
further sidtab overhaul in a succeeding patch.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
 security/selinux/ss/policydb.c |  10 ++-
 security/selinux/ss/services.c |  88 ++++++++++--------
 security/selinux/ss/services.h |   2 +-
 security/selinux/ss/sidtab.c   | 158 +++++++++++++++++++--------------
 security/selinux/ss/sidtab.h   |  14 +--
 5 files changed, 162 insertions(+), 110 deletions(-)

diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
index b63ef865ce1e..59359fa0bd74 100644
--- a/security/selinux/ss/policydb.c
+++ b/security/selinux/ss/policydb.c
@@ -909,13 +909,21 @@ int policydb_load_isids(struct policydb *p, struct sidtab *s)
 		if (!c->context[0].user) {
 			pr_err("SELinux:  SID %s was never defined.\n",
 				c->u.name);
+			sidtab_destroy(s);
+			goto out;
+		}
+		if (c->sid[0] > SECINITSID_NUM) {
+			pr_err("SELinux:  Initial SID %s out of range.\n",
+				c->u.name);
+			sidtab_destroy(s);
 			goto out;
 		}
 
-		rc = sidtab_insert(s, c->sid[0], &c->context[0]);
+		rc = sidtab_set_initial(s, c->sid[0], &c->context[0]);
 		if (rc) {
 			pr_err("SELinux:  unable to load initial SID %s.\n",
 				c->u.name);
+			sidtab_destroy(s);
 			goto out;
 		}
 	}
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index 7337db24a6a8..30170d4c567a 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -776,7 +776,7 @@ static int security_compute_validatetrans(struct selinux_state *state,
 	read_lock(&state->ss->policy_rwlock);
 
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+	sidtab = state->ss->sidtab;
 
 	if (!user)
 		tclass = unmap_class(&state->ss->map, orig_tclass);
@@ -876,7 +876,7 @@ int security_bounded_transition(struct selinux_state *state,
 	read_lock(&state->ss->policy_rwlock);
 
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+	sidtab = state->ss->sidtab;
 
 	rc = -EINVAL;
 	old_context = sidtab_search(sidtab, old_sid);
@@ -1034,7 +1034,7 @@ void security_compute_xperms_decision(struct selinux_state *state,
 		goto allow;
 
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+	sidtab = state->ss->sidtab;
 
 	scontext = sidtab_search(sidtab, ssid);
 	if (!scontext) {
@@ -1123,7 +1123,7 @@ void security_compute_av(struct selinux_state *state,
 		goto allow;
 
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+	sidtab = state->ss->sidtab;
 
 	scontext = sidtab_search(sidtab, ssid);
 	if (!scontext) {
@@ -1177,7 +1177,7 @@ void security_compute_av_user(struct selinux_state *state,
 		goto allow;
 
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+	sidtab = state->ss->sidtab;
 
 	scontext = sidtab_search(sidtab, ssid);
 	if (!scontext) {
@@ -1315,7 +1315,7 @@ static int security_sid_to_context_core(struct selinux_state *state,
 	}
 	read_lock(&state->ss->policy_rwlock);
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+	sidtab = state->ss->sidtab;
 	if (force)
 		context = sidtab_search_force(sidtab, sid);
 	else
@@ -1483,7 +1483,7 @@ static int security_context_to_sid_core(struct selinux_state *state,
 	}
 	read_lock(&state->ss->policy_rwlock);
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+	sidtab = state->ss->sidtab;
 	rc = string_to_context_struct(policydb, sidtab, scontext2,
 				      &context, def_sid);
 	if (rc == -EINVAL && force) {
@@ -1668,7 +1668,7 @@ static int security_compute_sid(struct selinux_state *state,
 	}
 
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+	sidtab = state->ss->sidtab;
 
 	scontext = sidtab_search(sidtab, ssid);
 	if (!scontext) {
@@ -1925,10 +1925,7 @@ static int convert_context(u32 key,
 	struct user_datum *usrdatum;
 	char *s;
 	u32 len;
-	int rc = 0;
-
-	if (key <= SECINITSID_NUM)
-		goto out;
+	int rc;
 
 	args = p;
 
@@ -2090,9 +2087,8 @@ static int security_preserve_bools(struct selinux_state *state,
 int security_load_policy(struct selinux_state *state, void *data, size_t len)
 {
 	struct policydb *policydb;
-	struct sidtab *sidtab;
+	struct sidtab *oldsidtab, *newsidtab;
 	struct policydb *oldpolicydb, *newpolicydb;
-	struct sidtab oldsidtab, newsidtab;
 	struct selinux_mapping *oldmapping;
 	struct selinux_map newmap;
 	struct convert_context_args args;
@@ -2108,27 +2104,37 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
 	newpolicydb = oldpolicydb + 1;
 
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+
+	newsidtab = kmalloc(sizeof(*newsidtab), GFP_KERNEL);
+	if (!newsidtab) {
+		rc = -ENOMEM;
+		goto out;
+	}
 
 	if (!state->initialized) {
 		rc = policydb_read(policydb, fp);
-		if (rc)
+		if (rc) {
+			kfree(newsidtab);
 			goto out;
+		}
 
 		policydb->len = len;
 		rc = selinux_set_mapping(policydb, secclass_map,
 					 &state->ss->map);
 		if (rc) {
+			kfree(newsidtab);
 			policydb_destroy(policydb);
 			goto out;
 		}
 
-		rc = policydb_load_isids(policydb, sidtab);
+		rc = policydb_load_isids(policydb, newsidtab);
 		if (rc) {
+			kfree(newsidtab);
 			policydb_destroy(policydb);
 			goto out;
 		}
 
+		state->ss->sidtab = newsidtab;
 		security_load_policycaps(state);
 		state->initialized = 1;
 		seqno = ++state->ss->latest_granting;
@@ -2141,13 +2147,17 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
 		goto out;
 	}
 
+	oldsidtab = state->ss->sidtab;
+
 #if 0
-	sidtab_hash_eval(sidtab, "sids");
+	sidtab_hash_eval(oldsidtab, "sids");
 #endif
 
 	rc = policydb_read(newpolicydb, fp);
-	if (rc)
+	if (rc) {
+		kfree(newsidtab);
 		goto out;
+	}
 
 	newpolicydb->len = len;
 	/* If switching between different policy types, log MLS status */
@@ -2156,10 +2166,11 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
 	else if (!policydb->mls_enabled && newpolicydb->mls_enabled)
 		pr_info("SELinux: Enabling MLS support...\n");
 
-	rc = policydb_load_isids(newpolicydb, &newsidtab);
+	rc = policydb_load_isids(newpolicydb, newsidtab);
 	if (rc) {
 		pr_err("SELinux:  unable to load the initial SIDs\n");
 		policydb_destroy(newpolicydb);
+		kfree(newsidtab);
 		goto out;
 	}
 
@@ -2180,7 +2191,7 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
 	args.state = state;
 	args.oldp = policydb;
 	args.newp = newpolicydb;
-	rc = sidtab_convert(sidtab, &newsidtab, convert_context, &args);
+	rc = sidtab_convert(oldsidtab, newsidtab, convert_context, &args);
 	if (rc) {
 		pr_err("SELinux:  unable to convert the internal"
 			" representation of contexts in the new SID"
@@ -2190,12 +2201,11 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
 
 	/* Save the old policydb and SID table to free later. */
 	memcpy(oldpolicydb, policydb, sizeof(*policydb));
-	sidtab_set(&oldsidtab, sidtab);
 
 	/* Install the new policydb and SID table. */
 	write_lock_irq(&state->ss->policy_rwlock);
 	memcpy(policydb, newpolicydb, sizeof(*policydb));
-	sidtab_set(sidtab, &newsidtab);
+	state->ss->sidtab = newsidtab;
 	security_load_policycaps(state);
 	oldmapping = state->ss->map.mapping;
 	state->ss->map.mapping = newmap.mapping;
@@ -2205,7 +2215,8 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
 
 	/* Free the old policydb and SID table. */
 	policydb_destroy(oldpolicydb);
-	sidtab_destroy(&oldsidtab);
+	sidtab_destroy(oldsidtab);
+	kfree(oldsidtab);
 	kfree(oldmapping);
 
 	avc_ss_reset(state->avc, seqno);
@@ -2219,7 +2230,8 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
 
 err:
 	kfree(newmap.mapping);
-	sidtab_destroy(&newsidtab);
+	sidtab_destroy(newsidtab);
+	kfree(newsidtab);
 	policydb_destroy(newpolicydb);
 
 out:
@@ -2256,7 +2268,7 @@ int security_port_sid(struct selinux_state *state,
 	read_lock(&state->ss->policy_rwlock);
 
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+	sidtab = state->ss->sidtab;
 
 	c = policydb->ocontexts[OCON_PORT];
 	while (c) {
@@ -2302,7 +2314,7 @@ int security_ib_pkey_sid(struct selinux_state *state,
 	read_lock(&state->ss->policy_rwlock);
 
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+	sidtab = state->ss->sidtab;
 
 	c = policydb->ocontexts[OCON_IBPKEY];
 	while (c) {
@@ -2348,7 +2360,7 @@ int security_ib_endport_sid(struct selinux_state *state,
 	read_lock(&state->ss->policy_rwlock);
 
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+	sidtab = state->ss->sidtab;
 
 	c = policydb->ocontexts[OCON_IBENDPORT];
 	while (c) {
@@ -2394,7 +2406,7 @@ int security_netif_sid(struct selinux_state *state,
 	read_lock(&state->ss->policy_rwlock);
 
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+	sidtab = state->ss->sidtab;
 
 	c = policydb->ocontexts[OCON_NETIF];
 	while (c) {
@@ -2459,7 +2471,7 @@ int security_node_sid(struct selinux_state *state,
 	read_lock(&state->ss->policy_rwlock);
 
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+	sidtab = state->ss->sidtab;
 
 	switch (domain) {
 	case AF_INET: {
@@ -2559,7 +2571,7 @@ int security_get_user_sids(struct selinux_state *state,
 	read_lock(&state->ss->policy_rwlock);
 
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+	sidtab = state->ss->sidtab;
 
 	context_init(&usercon);
 
@@ -2661,7 +2673,7 @@ static inline int __security_genfs_sid(struct selinux_state *state,
 				       u32 *sid)
 {
 	struct policydb *policydb = &state->ss->policydb;
-	struct sidtab *sidtab = &state->ss->sidtab;
+	struct sidtab *sidtab = state->ss->sidtab;
 	int len;
 	u16 sclass;
 	struct genfs *genfs;
@@ -2747,7 +2759,7 @@ int security_fs_use(struct selinux_state *state, struct super_block *sb)
 	read_lock(&state->ss->policy_rwlock);
 
 	policydb = &state->ss->policydb;
-	sidtab = &state->ss->sidtab;
+	sidtab = state->ss->sidtab;
 
 	c = policydb->ocontexts[OCON_FSUSE];
 	while (c) {
@@ -2953,7 +2965,7 @@ int security_sid_mls_copy(struct selinux_state *state,
 			  u32 sid, u32 mls_sid, u32 *new_sid)
 {
 	struct policydb *policydb = &state->ss->policydb;
-	struct sidtab *sidtab = &state->ss->sidtab;
+	struct sidtab *sidtab = state->ss->sidtab;
 	struct context *context1;
 	struct context *context2;
 	struct context newcon;
@@ -3044,7 +3056,7 @@ int security_net_peersid_resolve(struct selinux_state *state,
 				 u32 *peer_sid)
 {
 	struct policydb *policydb = &state->ss->policydb;
-	struct sidtab *sidtab = &state->ss->sidtab;
+	struct sidtab *sidtab = state->ss->sidtab;
 	int rc;
 	struct context *nlbl_ctx;
 	struct context *xfrm_ctx;
@@ -3405,7 +3417,7 @@ int selinux_audit_rule_match(u32 sid, u32 field, u32 op, void *vrule,
 		goto out;
 	}
 
-	ctxt = sidtab_search(&state->ss->sidtab, sid);
+	ctxt = sidtab_search(state->ss->sidtab, sid);
 	if (unlikely(!ctxt)) {
 		WARN_ONCE(1, "selinux_audit_rule_match: unrecognized SID %d\n",
 			  sid);
@@ -3568,7 +3580,7 @@ int security_netlbl_secattr_to_sid(struct selinux_state *state,
 				   u32 *sid)
 {
 	struct policydb *policydb = &state->ss->policydb;
-	struct sidtab *sidtab = &state->ss->sidtab;
+	struct sidtab *sidtab = state->ss->sidtab;
 	int rc;
 	struct context *ctx;
 	struct context ctx_new;
@@ -3646,7 +3658,7 @@ int security_netlbl_sid_to_secattr(struct selinux_state *state,
 	read_lock(&state->ss->policy_rwlock);
 
 	rc = -ENOENT;
-	ctx = sidtab_search(&state->ss->sidtab, sid);
+	ctx = sidtab_search(state->ss->sidtab, sid);
 	if (ctx == NULL)
 		goto out;
 
diff --git a/security/selinux/ss/services.h b/security/selinux/ss/services.h
index 24c7bdcc8075..9a36de860368 100644
--- a/security/selinux/ss/services.h
+++ b/security/selinux/ss/services.h
@@ -24,7 +24,7 @@ struct selinux_map {
 };
 
 struct selinux_ss {
-	struct sidtab sidtab;
+	struct sidtab *sidtab;
 	struct policydb policydb;
 	rwlock_t policy_rwlock;
 	u32 latest_granting;
diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
index ccc0ea230df4..fd8115b211a6 100644
--- a/security/selinux/ss/sidtab.c
+++ b/security/selinux/ss/sidtab.c
@@ -22,16 +22,24 @@ int sidtab_init(struct sidtab *s)
 	s->htable = kmalloc_array(SIDTAB_SIZE, sizeof(*s->htable), GFP_ATOMIC);
 	if (!s->htable)
 		return -ENOMEM;
+
+	for (i = 0; i <= SECINITSID_NUM; i++)
+		s->isids[i].set = 0;
+
 	for (i = 0; i < SIDTAB_SIZE; i++)
 		s->htable[i] = NULL;
+
+	for (i = 0; i < SIDTAB_CACHE_LEN; i++)
+		s->cache[i] = NULL;
+
 	s->nel = 0;
-	s->next_sid = 1;
+	s->next_sid = 0;
 	s->shutdown = 0;
 	spin_lock_init(&s->lock);
 	return 0;
 }
 
-int sidtab_insert(struct sidtab *s, u32 sid, struct context *context)
+static int sidtab_insert(struct sidtab *s, u32 sid, struct context *context)
 {
 	int hvalue;
 	struct sidtab_node *prev, *cur, *newnode;
@@ -76,34 +84,52 @@ int sidtab_insert(struct sidtab *s, u32 sid, struct context *context)
 	return 0;
 }
 
-static struct context *sidtab_search_core(struct sidtab *s, u32 sid, int force)
+int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
+{
+	struct sidtab_isid_entry *entry = &s->isids[sid];
+	int rc = context_cpy(&entry->context, context);
+	if (rc)
+		return rc;
+
+	entry->set = 1;
+	return 0;
+}
+
+static struct context *sidtab_lookup(struct sidtab *s, u32 sid)
 {
 	int hvalue;
 	struct sidtab_node *cur;
 
-	if (!s)
-		return NULL;
-
 	hvalue = SIDTAB_HASH(sid);
 	cur = s->htable[hvalue];
 	while (cur && sid > cur->sid)
 		cur = cur->next;
 
-	if (force && cur && sid == cur->sid && cur->context.len)
-		return &cur->context;
+	if (!cur || sid != cur->sid)
+		return NULL;
 
-	if (!cur || sid != cur->sid || cur->context.len) {
-		/* Remap invalid SIDs to the unlabeled SID. */
-		sid = SECINITSID_UNLABELED;
-		hvalue = SIDTAB_HASH(sid);
-		cur = s->htable[hvalue];
-		while (cur && sid > cur->sid)
-			cur = cur->next;
-		if (!cur || sid != cur->sid)
-			return NULL;
+	return &cur->context;
+}
+
+static struct context *sidtab_search_core(struct sidtab *s, u32 sid, int force)
+{
+	struct context *context;
+	struct sidtab_isid_entry *entry;
+
+	if (!s)
+		return NULL;
+
+	if (sid > SECINITSID_NUM) {
+		context = sidtab_lookup(s, sid - (SECINITSID_NUM + 1));
+	} else {
+		entry = &s->isids[sid];
+		context = entry->set ? &entry->context : NULL;
 	}
+	if (context && (!context->len || force))
+		return context;
 
-	return &cur->context;
+	entry = &s->isids[SECINITSID_UNLABELED];
+	return entry->set ? &entry->context : NULL;
 }
 
 struct context *sidtab_search(struct sidtab *s, u32 sid)
@@ -145,11 +171,7 @@ out:
 static int clone_sid(u32 sid, struct context *context, void *arg)
 {
 	struct sidtab *s = arg;
-
-	if (sid > SECINITSID_NUM)
-		return sidtab_insert(s, sid, context);
-	else
-		return 0;
+	return sidtab_insert(s, sid, context);
 }
 
 int sidtab_convert(struct sidtab *s, struct sidtab *news,
@@ -183,8 +205,8 @@ static void sidtab_update_cache(struct sidtab *s, struct sidtab_node *n, int loc
 	s->cache[0] = n;
 }
 
-static inline u32 sidtab_search_context(struct sidtab *s,
-						  struct context *context)
+static inline int sidtab_search_context(struct sidtab *s,
+					struct context *context, u32 *sid)
 {
 	int i;
 	struct sidtab_node *cur;
@@ -194,15 +216,17 @@ static inline u32 sidtab_search_context(struct sidtab *s,
 		while (cur) {
 			if (context_cmp(&cur->context, context)) {
 				sidtab_update_cache(s, cur, SIDTAB_CACHE_LEN - 1);
-				return cur->sid;
+				*sid = cur->sid;
+				return 0;
 			}
 			cur = cur->next;
 		}
 	}
-	return 0;
+	return -ENOENT;
 }
 
-static inline u32 sidtab_search_cache(struct sidtab *s, struct context *context)
+static inline int sidtab_search_cache(struct sidtab *s, struct context *context,
+				      u32 *sid)
 {
 	int i;
 	struct sidtab_node *node;
@@ -210,54 +234,68 @@ static inline u32 sidtab_search_cache(struct sidtab *s, struct context *context)
 	for (i = 0; i < SIDTAB_CACHE_LEN; i++) {
 		node = s->cache[i];
 		if (unlikely(!node))
-			return 0;
+			return -ENOENT;
 		if (context_cmp(&node->context, context)) {
 			sidtab_update_cache(s, node, i);
-			return node->sid;
+			*sid = node->sid;
+			return 0;
 		}
 	}
-	return 0;
+	return -ENOENT;
 }
 
-int sidtab_context_to_sid(struct sidtab *s,
-			  struct context *context,
-			  u32 *out_sid)
+static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
+				 u32 *sid)
 {
-	u32 sid;
-	int ret = 0;
+	int ret;
 	unsigned long flags;
 
-	*out_sid = SECSID_NULL;
-
-	sid  = sidtab_search_cache(s, context);
-	if (!sid)
-		sid = sidtab_search_context(s, context);
-	if (!sid) {
+	ret = sidtab_search_cache(s, context, sid);
+	if (ret)
+		ret = sidtab_search_context(s, context, sid);
+	if (ret) {
 		spin_lock_irqsave(&s->lock, flags);
 		/* Rescan now that we hold the lock. */
-		sid = sidtab_search_context(s, context);
-		if (sid)
+		ret = sidtab_search_context(s, context, sid);
+		if (!ret)
 			goto unlock_out;
 		/* No SID exists for the context.  Allocate a new one. */
-		if (s->next_sid == UINT_MAX || s->shutdown) {
+		if (s->next_sid == (UINT_MAX - SECINITSID_NUM - 1) || s->shutdown) {
 			ret = -ENOMEM;
 			goto unlock_out;
 		}
-		sid = s->next_sid++;
+		*sid = s->next_sid++;
 		if (context->len)
 			pr_info("SELinux:  Context %s is not valid (left unmapped).\n",
 			       context->str);
-		ret = sidtab_insert(s, sid, context);
+		ret = sidtab_insert(s, *sid, context);
 		if (ret)
 			s->next_sid--;
 unlock_out:
 		spin_unlock_irqrestore(&s->lock, flags);
 	}
 
-	if (ret)
-		return ret;
+	return ret;
+}
 
-	*out_sid = sid;
+int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid)
+{
+	int rc;
+	u32 i;
+
+	for (i = 0; i <= SECINITSID_NUM; i++) {
+		struct sidtab_isid_entry *entry = &s->isids[i];
+
+		if (entry->set && context_cmp(context, &entry->context)) {
+			*sid = i;
+			return 0;
+		}
+	}
+
+	rc = sidtab_reverse_lookup(s, context, sid);
+	if (rc)
+		return rc;
+	*sid += SECINITSID_NUM + 1;
 	return 0;
 }
 
@@ -296,6 +334,11 @@ void sidtab_destroy(struct sidtab *s)
 	if (!s)
 		return;
 
+	for (i = 0; i <= SECINITSID_NUM; i++)
+		if (s->isids[i].set)
+			context_destroy(&s->isids[i].context);
+
+
 	for (i = 0; i < SIDTAB_SIZE; i++) {
 		cur = s->htable[i];
 		while (cur) {
@@ -311,18 +354,3 @@ void sidtab_destroy(struct sidtab *s)
 	s->nel = 0;
 	s->next_sid = 1;
 }
-
-void sidtab_set(struct sidtab *dst, struct sidtab *src)
-{
-	unsigned long flags;
-	int i;
-
-	spin_lock_irqsave(&src->lock, flags);
-	dst->htable = src->htable;
-	dst->nel = src->nel;
-	dst->next_sid = src->next_sid;
-	dst->shutdown = 0;
-	for (i = 0; i < SIDTAB_CACHE_LEN; i++)
-		dst->cache[i] = NULL;
-	spin_unlock_irqrestore(&src->lock, flags);
-}
diff --git a/security/selinux/ss/sidtab.h b/security/selinux/ss/sidtab.h
index e1d1f0beb17c..dc0a80bc8894 100644
--- a/security/selinux/ss/sidtab.h
+++ b/security/selinux/ss/sidtab.h
@@ -22,6 +22,11 @@ struct sidtab_node {
 
 #define SIDTAB_SIZE SIDTAB_HASH_BUCKETS
 
+struct sidtab_isid_entry {
+	int set;
+	struct context context;
+};
+
 struct sidtab {
 	struct sidtab_node **htable;
 	unsigned int nel;	/* number of elements */
@@ -30,10 +35,12 @@ struct sidtab {
 #define SIDTAB_CACHE_LEN	3
 	struct sidtab_node *cache[SIDTAB_CACHE_LEN];
 	spinlock_t lock;
+
+	struct sidtab_isid_entry isids[SECINITSID_NUM + 1];
 };
 
 int sidtab_init(struct sidtab *s);
-int sidtab_insert(struct sidtab *s, u32 sid, struct context *context);
+int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context);
 struct context *sidtab_search(struct sidtab *s, u32 sid);
 struct context *sidtab_search_force(struct sidtab *s, u32 sid);
 
@@ -43,13 +50,10 @@ int sidtab_convert(struct sidtab *s, struct sidtab *news,
 				void *args),
 		   void *args);
 
-int sidtab_context_to_sid(struct sidtab *s,
-			  struct context *context,
-			  u32 *sid);
+int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid);
 
 void sidtab_hash_eval(struct sidtab *h, char *tag);
 void sidtab_destroy(struct sidtab *s);
-void sidtab_set(struct sidtab *dst, struct sidtab *src);
 
 #endif	/* _SS_SIDTAB_H_ */
 
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH v2 2/4] [squash] do not store entry for SECSID_NULL
  2018-11-27 10:36 [RFC PATCH v2 0/4] Fix ENOMEM errors during policy reload Ondrej Mosnacek
  2018-11-27 10:36 ` [RFC PATCH v2 1/4] selinux: use separate table for initial SID lookup Ondrej Mosnacek
@ 2018-11-27 10:36 ` Ondrej Mosnacek
  2018-11-27 17:00   ` Stephen Smalley
  2018-11-27 10:36 ` [RFC PATCH v2 3/4] selinux: overhaul sidtab to fix bug and improve performance Ondrej Mosnacek
  2018-11-27 10:36 ` [RFC PATCH v2 4/4] [squash] add back reverse lookup cache to sidtab Ondrej Mosnacek
  3 siblings, 1 reply; 11+ messages in thread
From: Ondrej Mosnacek @ 2018-11-27 10:36 UTC (permalink / raw)
  To: selinux, Paul Moore; +Cc: Stephen Smalley, Ondrej Mosnacek

This patch is kept separate only for review. Eventually it will be
folded into the previous patch.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
 security/selinux/ss/policydb.c |  2 +-
 security/selinux/ss/sidtab.c   | 25 ++++++++++++++++---------
 security/selinux/ss/sidtab.h   |  3 ++-
 3 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
index 59359fa0bd74..a50d625e7946 100644
--- a/security/selinux/ss/policydb.c
+++ b/security/selinux/ss/policydb.c
@@ -912,7 +912,7 @@ int policydb_load_isids(struct policydb *p, struct sidtab *s)
 			sidtab_destroy(s);
 			goto out;
 		}
-		if (c->sid[0] > SECINITSID_NUM) {
+		if (c->sid[0] == SECSID_NULL || c->sid[0] > SECINITSID_NUM) {
 			pr_err("SELinux:  Initial SID %s out of range.\n",
 				c->u.name);
 			sidtab_destroy(s);
diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
index fd8115b211a6..e157d8240cf1 100644
--- a/security/selinux/ss/sidtab.c
+++ b/security/selinux/ss/sidtab.c
@@ -23,7 +23,7 @@ int sidtab_init(struct sidtab *s)
 	if (!s->htable)
 		return -ENOMEM;
 
-	for (i = 0; i <= SECINITSID_NUM; i++)
+	for (i = 0; i < SECINITSID_NUM; i++)
 		s->isids[i].set = 0;
 
 	for (i = 0; i < SIDTAB_SIZE; i++)
@@ -86,8 +86,15 @@ static int sidtab_insert(struct sidtab *s, u32 sid, struct context *context)
 
 int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
 {
-	struct sidtab_isid_entry *entry = &s->isids[sid];
-	int rc = context_cpy(&entry->context, context);
+	struct sidtab_isid_entry *entry;
+	int rc;
+
+	if (sid == 0 || sid > SECINITSID_NUM)
+		return -EINVAL;
+
+	entry = &s->isids[sid - 1];
+
+	rc = context_cpy(&entry->context, context);
 	if (rc)
 		return rc;
 
@@ -116,19 +123,19 @@ static struct context *sidtab_search_core(struct sidtab *s, u32 sid, int force)
 	struct context *context;
 	struct sidtab_isid_entry *entry;
 
-	if (!s)
+	if (!s || sid == 0)
 		return NULL;
 
 	if (sid > SECINITSID_NUM) {
 		context = sidtab_lookup(s, sid - (SECINITSID_NUM + 1));
 	} else {
-		entry = &s->isids[sid];
+		entry = &s->isids[sid - 1];
 		context = entry->set ? &entry->context : NULL;
 	}
 	if (context && (!context->len || force))
 		return context;
 
-	entry = &s->isids[SECINITSID_UNLABELED];
+	entry = &s->isids[SECINITSID_UNLABELED - 1];
 	return entry->set ? &entry->context : NULL;
 }
 
@@ -283,11 +290,11 @@ int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid)
 	int rc;
 	u32 i;
 
-	for (i = 0; i <= SECINITSID_NUM; i++) {
+	for (i = 0; i < SECINITSID_NUM; i++) {
 		struct sidtab_isid_entry *entry = &s->isids[i];
 
 		if (entry->set && context_cmp(context, &entry->context)) {
-			*sid = i;
+			*sid = i + 1;
 			return 0;
 		}
 	}
@@ -334,7 +341,7 @@ void sidtab_destroy(struct sidtab *s)
 	if (!s)
 		return;
 
-	for (i = 0; i <= SECINITSID_NUM; i++)
+	for (i = 0; i < SECINITSID_NUM; i++)
 		if (s->isids[i].set)
 			context_destroy(&s->isids[i].context);
 
diff --git a/security/selinux/ss/sidtab.h b/security/selinux/ss/sidtab.h
index dc0a80bc8894..e657ae6bf996 100644
--- a/security/selinux/ss/sidtab.h
+++ b/security/selinux/ss/sidtab.h
@@ -36,7 +36,8 @@ struct sidtab {
 	struct sidtab_node *cache[SIDTAB_CACHE_LEN];
 	spinlock_t lock;
 
-	struct sidtab_isid_entry isids[SECINITSID_NUM + 1];
+	/* index == SID - 1 (no entry for SECSID_NULL) */
+	struct sidtab_isid_entry isids[SECINITSID_NUM];
 };
 
 int sidtab_init(struct sidtab *s);
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH v2 3/4] selinux: overhaul sidtab to fix bug and improve performance
  2018-11-27 10:36 [RFC PATCH v2 0/4] Fix ENOMEM errors during policy reload Ondrej Mosnacek
  2018-11-27 10:36 ` [RFC PATCH v2 1/4] selinux: use separate table for initial SID lookup Ondrej Mosnacek
  2018-11-27 10:36 ` [RFC PATCH v2 2/4] [squash] do not store entry for SECSID_NULL Ondrej Mosnacek
@ 2018-11-27 10:36 ` Ondrej Mosnacek
  2018-11-27 19:41   ` Stephen Smalley
  2018-11-27 10:36 ` [RFC PATCH v2 4/4] [squash] add back reverse lookup cache to sidtab Ondrej Mosnacek
  3 siblings, 1 reply; 11+ messages in thread
From: Ondrej Mosnacek @ 2018-11-27 10:36 UTC (permalink / raw)
  To: selinux, Paul Moore; +Cc: Stephen Smalley, Ondrej Mosnacek

Before this patch, during a policy reload the sidtab would become frozen
and trying to map a new context to SID would be unable to add a new
entry to sidtab and fail with -ENOMEM.

Such failures are usually propagated into userspace, which has no way of
distignuishing them from actual allocation failures and thus doesn't
handle them gracefully. Such situation can be triggered e.g. by the
following reproducer:

    while true; do load_policy; echo -n .; sleep 0.1; done &
    for (( i = 0; i < 1024; i++ )); do
        runcon -l s0:c$i echo -n x || break
        # or:
        # chcon -l s0:c$i <some_file> || break
    done

This patch overhauls the sidtab so it doesn't need to be frozen during
policy reload, thus solving the above problem.

The new SID table leverages the fact that SIDs are allocated
sequentially and are never invalidated and stores them in linear buckets
indexed by a tree structure. This brings several advantages:
  1. Fast SID -> context lookup - this lookup can now be done in
     logarithmic time complexity (usually in less than 4 array lookups)
     and can still be done safely without locking.
  2. No need to re-search the whole table on reverse lookup miss - after
     acquiring the spinlock only the newly added entries need to be
     searched, which means that reverse lookups that end up inserting a
     new entry are now about twice as fast.
  3. No need to freeze sidtab during policy reload - it is now possible
     to handle insertion of new entries even during sidtab conversion.

The tree structure of the new sidtab is able to grow automatically to up
to about 2^31 entries (at which point it should not have more than about
4 tree levels). The old sidtab had a theoretical capacity of almost 2^32
entries, but half of that is still more than enough since by that point
the reverse table lookups would become unusably slow anyway...

The number of entries per tree node is selected automatically so that
each node fits into a single page, which should be the easiest size for
kmalloc() to handle.

Note that the new implementation does not have a cache for recent
reverse lookups as the old one had. Such cache is, however, added in a
subsequent patch.

Tested by selinux-testsuite and thoroughly tortured by this simple
stress test:
```
function rand_cat() {
	echo $(( $RANDOM % 1024 ))
}

function do_work() {
	while true; do
		echo -n "system_u:system_r:kernel_t:s0:c$(rand_cat),c$(rand_cat)" \
			>/sys/fs/selinux/context 2>/dev/null || true
	done
}

do_work >/dev/null &
do_work >/dev/null &
do_work >/dev/null &

while load_policy; do echo -n .; sleep 0.1; done

kill %1
kill %2
kill %3
```

Reported-by: Orion Poplawski <orion@nwra.com>
Reported-by: Li Kun <hw.likun@huawei.com>
Link: https://github.com/SELinuxProject/selinux-kernel/issues/38
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
 security/selinux/ss/mls.c      |  23 +-
 security/selinux/ss/mls.h      |   3 +-
 security/selinux/ss/services.c |  90 +++---
 security/selinux/ss/sidtab.c   | 522 +++++++++++++++++++--------------
 security/selinux/ss/sidtab.h   |  75 +++--
 5 files changed, 402 insertions(+), 311 deletions(-)

diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c
index 2fe459df3c85..267ae4f9be79 100644
--- a/security/selinux/ss/mls.c
+++ b/security/selinux/ss/mls.c
@@ -436,16 +436,17 @@ int mls_setup_user_range(struct policydb *p,
 
 /*
  * Convert the MLS fields in the security context
- * structure `c' from the values specified in the
- * policy `oldp' to the values specified in the policy `newp'.
+ * structure `oldc' from the values specified in the
+ * policy `oldp' to the values specified in the policy `newp',
+ * storing the resulting context in `newc'.
  */
 int mls_convert_context(struct policydb *oldp,
 			struct policydb *newp,
-			struct context *c)
+			struct context *oldc,
+			struct context *newc)
 {
 	struct level_datum *levdatum;
 	struct cat_datum *catdatum;
-	struct ebitmap bitmap;
 	struct ebitmap_node *node;
 	int l, i;
 
@@ -455,28 +456,24 @@ int mls_convert_context(struct policydb *oldp,
 	for (l = 0; l < 2; l++) {
 		levdatum = hashtab_search(newp->p_levels.table,
 					  sym_name(oldp, SYM_LEVELS,
-						   c->range.level[l].sens - 1));
+						   oldc->range.level[l].sens - 1));
 
 		if (!levdatum)
 			return -EINVAL;
-		c->range.level[l].sens = levdatum->level->sens;
+		newc->range.level[l].sens = levdatum->level->sens;
 
-		ebitmap_init(&bitmap);
-		ebitmap_for_each_positive_bit(&c->range.level[l].cat, node, i) {
+		ebitmap_for_each_positive_bit(&oldc->range.level[l].cat, node, i) {
 			int rc;
 
 			catdatum = hashtab_search(newp->p_cats.table,
 						  sym_name(oldp, SYM_CATS, i));
 			if (!catdatum)
 				return -EINVAL;
-			rc = ebitmap_set_bit(&bitmap, catdatum->value - 1, 1);
+			rc = ebitmap_set_bit(&newc->range.level[l].cat,
+					     catdatum->value - 1, 1);
 			if (rc)
 				return rc;
-
-			cond_resched();
 		}
-		ebitmap_destroy(&c->range.level[l].cat);
-		c->range.level[l].cat = bitmap;
 	}
 
 	return 0;
diff --git a/security/selinux/ss/mls.h b/security/selinux/ss/mls.h
index 67093647576d..7954b1e60b64 100644
--- a/security/selinux/ss/mls.h
+++ b/security/selinux/ss/mls.h
@@ -46,7 +46,8 @@ int mls_range_set(struct context *context, struct mls_range *range);
 
 int mls_convert_context(struct policydb *oldp,
 			struct policydb *newp,
-			struct context *context);
+			struct context *oldc,
+			struct context *newc);
 
 int mls_compute_sid(struct policydb *p,
 		    struct context *scontext,
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index 30170d4c567a..3c5887838e12 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -1907,19 +1907,16 @@ struct convert_context_args {
 
 /*
  * Convert the values in the security context
- * structure `c' from the values specified
+ * structure `oldc' from the values specified
  * in the policy `p->oldp' to the values specified
- * in the policy `p->newp'.  Verify that the
- * context is valid under the new policy.
+ * in the policy `p->newp', storing the new context
+ * in `newc'.  Verify that the context is valid
+ * under the new policy.
  */
-static int convert_context(u32 key,
-			   struct context *c,
-			   void *p)
+static int convert_context(struct context *oldc, struct context *newc, void *p)
 {
 	struct convert_context_args *args;
-	struct context oldc;
 	struct ocontext *oc;
-	struct mls_range *range;
 	struct role_datum *role;
 	struct type_datum *typdatum;
 	struct user_datum *usrdatum;
@@ -1929,23 +1926,18 @@ static int convert_context(u32 key,
 
 	args = p;
 
-	if (c->str) {
-		struct context ctx;
-
+	if (oldc->str) {
 		rc = -ENOMEM;
-		s = kstrdup(c->str, GFP_KERNEL);
+		s = kstrdup(oldc->str, GFP_KERNEL);
 		if (!s)
 			goto out;
 
 		rc = string_to_context_struct(args->newp, NULL, s,
-					      &ctx, SECSID_NULL);
+					      newc, SECSID_NULL);
 		kfree(s);
 		if (!rc) {
 			pr_info("SELinux:  Context %s became valid (mapped).\n",
-			       c->str);
-			/* Replace string with mapped representation. */
-			kfree(c->str);
-			memcpy(c, &ctx, sizeof(*c));
+				oldc->str);
 			goto out;
 		} else if (rc == -EINVAL) {
 			/* Retain string representation for later mapping. */
@@ -1954,51 +1946,42 @@ static int convert_context(u32 key,
 		} else {
 			/* Other error condition, e.g. ENOMEM. */
 			pr_err("SELinux:   Unable to map context %s, rc = %d.\n",
-			       c->str, -rc);
+			       oldc->str, -rc);
 			goto out;
 		}
 	}
 
-	rc = context_cpy(&oldc, c);
-	if (rc)
-		goto out;
+	context_init(newc);
 
 	/* Convert the user. */
 	rc = -EINVAL;
 	usrdatum = hashtab_search(args->newp->p_users.table,
-				  sym_name(args->oldp, SYM_USERS, c->user - 1));
+				  sym_name(args->oldp, SYM_USERS, oldc->user - 1));
 	if (!usrdatum)
 		goto bad;
-	c->user = usrdatum->value;
+	newc->user = usrdatum->value;
 
 	/* Convert the role. */
 	rc = -EINVAL;
 	role = hashtab_search(args->newp->p_roles.table,
-			      sym_name(args->oldp, SYM_ROLES, c->role - 1));
+			      sym_name(args->oldp, SYM_ROLES, oldc->role - 1));
 	if (!role)
 		goto bad;
-	c->role = role->value;
+	newc->role = role->value;
 
 	/* Convert the type. */
 	rc = -EINVAL;
 	typdatum = hashtab_search(args->newp->p_types.table,
-				  sym_name(args->oldp, SYM_TYPES, c->type - 1));
+				  sym_name(args->oldp, SYM_TYPES, oldc->type - 1));
 	if (!typdatum)
 		goto bad;
-	c->type = typdatum->value;
+	newc->type = typdatum->value;
 
 	/* Convert the MLS fields if dealing with MLS policies */
 	if (args->oldp->mls_enabled && args->newp->mls_enabled) {
-		rc = mls_convert_context(args->oldp, args->newp, c);
+		rc = mls_convert_context(args->oldp, args->newp, oldc, newc);
 		if (rc)
 			goto bad;
-	} else if (args->oldp->mls_enabled && !args->newp->mls_enabled) {
-		/*
-		 * Switching between MLS and non-MLS policy:
-		 * free any storage used by the MLS fields in the
-		 * context for all existing entries in the sidtab.
-		 */
-		mls_context_destroy(c);
 	} else if (!args->oldp->mls_enabled && args->newp->mls_enabled) {
 		/*
 		 * Switching between non-MLS and MLS policy:
@@ -2016,36 +1999,31 @@ static int convert_context(u32 key,
 				" the initial SIDs list\n");
 			goto bad;
 		}
-		range = &oc->context[0].range;
-		rc = mls_range_set(c, range);
+		rc = mls_range_set(newc, &oc->context[0].range);
 		if (rc)
 			goto bad;
 	}
 
 	/* Check the validity of the new context. */
-	if (!policydb_context_isvalid(args->newp, c)) {
-		rc = convert_context_handle_invalid_context(args->state,
-							    &oldc);
+	if (!policydb_context_isvalid(args->newp, newc)) {
+		rc = convert_context_handle_invalid_context(args->state, oldc);
 		if (rc)
 			goto bad;
 	}
 
-	context_destroy(&oldc);
-
 	rc = 0;
 out:
 	return rc;
 bad:
 	/* Map old representation to string and save it. */
-	rc = context_struct_to_string(args->oldp, &oldc, &s, &len);
+	rc = context_struct_to_string(args->oldp, oldc, &s, &len);
 	if (rc)
 		return rc;
-	context_destroy(&oldc);
-	context_destroy(c);
-	c->str = s;
-	c->len = len;
+	context_destroy(newc);
+	newc->str = s;
+	newc->len = len;
 	pr_info("SELinux:  Context %s became invalid (unmapped).\n",
-	       c->str);
+		newc->str);
 	rc = 0;
 	goto out;
 }
@@ -2091,6 +2069,7 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
 	struct policydb *oldpolicydb, *newpolicydb;
 	struct selinux_mapping *oldmapping;
 	struct selinux_map newmap;
+	struct sidtab_convert_params convert_params;
 	struct convert_context_args args;
 	u32 seqno;
 	int rc = 0;
@@ -2147,12 +2126,6 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
 		goto out;
 	}
 
-	oldsidtab = state->ss->sidtab;
-
-#if 0
-	sidtab_hash_eval(oldsidtab, "sids");
-#endif
-
 	rc = policydb_read(newpolicydb, fp);
 	if (rc) {
 		kfree(newsidtab);
@@ -2184,6 +2157,8 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
 		goto err;
 	}
 
+	oldsidtab = state->ss->sidtab;
+
 	/*
 	 * Convert the internal representations of contexts
 	 * in the new SID table.
@@ -2191,7 +2166,12 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
 	args.state = state;
 	args.oldp = policydb;
 	args.newp = newpolicydb;
-	rc = sidtab_convert(oldsidtab, newsidtab, convert_context, &args);
+
+	convert_params.func = convert_context;
+	convert_params.args = &args;
+	convert_params.target = newsidtab;
+
+	rc = sidtab_convert(oldsidtab, &convert_params);
 	if (rc) {
 		pr_err("SELinux:  unable to convert the internal"
 			" representation of contexts in the new SID"
diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
index e157d8240cf1..a82deecfac09 100644
--- a/security/selinux/ss/sidtab.c
+++ b/security/selinux/ss/sidtab.c
@@ -2,88 +2,38 @@
 /*
  * Implementation of the SID table type.
  *
- * Author : Stephen Smalley, <sds@tycho.nsa.gov>
+ * Original author: Stephen Smalley, <sds@tycho.nsa.gov>
+ * Author: Ondrej Mosnacek, <omosnacek@gmail.com>
+ *
+ * Copyright (C) 2018 Red Hat, Inc.
  */
+#include <linux/errno.h>
 #include <linux/kernel.h>
 #include <linux/slab.h>
+#include <linux/sched.h>
 #include <linux/spinlock.h>
-#include <linux/errno.h>
+#include <linux/atomic.h>
 #include "flask.h"
 #include "security.h"
 #include "sidtab.h"
 
-#define SIDTAB_HASH(sid) \
-(sid & SIDTAB_HASH_MASK)
-
 int sidtab_init(struct sidtab *s)
 {
-	int i;
+	u32 i;
 
-	s->htable = kmalloc_array(SIDTAB_SIZE, sizeof(*s->htable), GFP_ATOMIC);
-	if (!s->htable)
-		return -ENOMEM;
+	memset(s->roots, 0, sizeof(s->roots));
 
 	for (i = 0; i < SECINITSID_NUM; i++)
 		s->isids[i].set = 0;
 
-	for (i = 0; i < SIDTAB_SIZE; i++)
-		s->htable[i] = NULL;
+	atomic_set(&s->count, 0);
 
-	for (i = 0; i < SIDTAB_CACHE_LEN; i++)
-		s->cache[i] = NULL;
+	s->convert = NULL;
 
-	s->nel = 0;
-	s->next_sid = 0;
-	s->shutdown = 0;
 	spin_lock_init(&s->lock);
 	return 0;
 }
 
-static int sidtab_insert(struct sidtab *s, u32 sid, struct context *context)
-{
-	int hvalue;
-	struct sidtab_node *prev, *cur, *newnode;
-
-	if (!s)
-		return -ENOMEM;
-
-	hvalue = SIDTAB_HASH(sid);
-	prev = NULL;
-	cur = s->htable[hvalue];
-	while (cur && sid > cur->sid) {
-		prev = cur;
-		cur = cur->next;
-	}
-
-	if (cur && sid == cur->sid)
-		return -EEXIST;
-
-	newnode = kmalloc(sizeof(*newnode), GFP_ATOMIC);
-	if (!newnode)
-		return -ENOMEM;
-
-	newnode->sid = sid;
-	if (context_cpy(&newnode->context, context)) {
-		kfree(newnode);
-		return -ENOMEM;
-	}
-
-	if (prev) {
-		newnode->next = prev->next;
-		wmb();
-		prev->next = newnode;
-	} else {
-		newnode->next = s->htable[hvalue];
-		wmb();
-		s->htable[hvalue] = newnode;
-	}
-
-	s->nel++;
-	if (sid >= s->next_sid)
-		s->next_sid = sid + 1;
-	return 0;
-}
-
 int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
 {
 	struct sidtab_isid_entry *entry;
@@ -102,20 +52,90 @@ int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
 	return 0;
 }
 
-static struct context *sidtab_lookup(struct sidtab *s, u32 sid)
+static u32 sidtab_level_from_count(u32 count)
 {
-	int hvalue;
-	struct sidtab_node *cur;
+	u32 capacity = SIDTAB_LEAF_ENTRIES;
+	u32 level = 0;
 
-	hvalue = SIDTAB_HASH(sid);
-	cur = s->htable[hvalue];
-	while (cur && sid > cur->sid)
-		cur = cur->next;
+	while (count > capacity) {
+		capacity <<= SIDTAB_INNER_SHIFT;
+		++level;
+	}
+	return level;
+}
+
+static int sidtab_alloc_roots(struct sidtab *s, u32 level)
+{
+	u32 l;
+
+	if (!s->roots[0].ptr_leaf) {
+		s->roots[0].ptr_leaf = kzalloc(SIDTAB_NODE_ALLOC_SIZE,
+					       GFP_ATOMIC);
+		if (!s->roots[0].ptr_leaf)
+			return -ENOMEM;
+	}
+	for (l = 1; l <= level; ++l)
+		if (!s->roots[l].ptr_inner) {
+			s->roots[l].ptr_inner = kzalloc(SIDTAB_NODE_ALLOC_SIZE,
+							GFP_ATOMIC);
+			if (!s->roots[l].ptr_inner)
+				return -ENOMEM;
+			s->roots[l].ptr_inner->entries[0] = s->roots[l - 1];
+		}
+	return 0;
+}
 
-	if (!cur || sid != cur->sid)
+static struct context *sidtab_do_lookup(struct sidtab *s, u32 index, int alloc)
+{
+	union sidtab_entry_inner *entry;
+	u32 level, capacity_shift, leaf_index = index / SIDTAB_LEAF_ENTRIES;
+
+	/* find the level of the subtree we need */
+	level = sidtab_level_from_count(index + 1);
+	capacity_shift = level * SIDTAB_INNER_SHIFT;
+
+	/* allocate roots if needed */
+	if (alloc && sidtab_alloc_roots(s, level) != 0)
 		return NULL;
 
-	return &cur->context;
+	/* lookup inside the subtree */
+	entry = &s->roots[level];
+	while (level != 0) {
+		capacity_shift -= SIDTAB_INNER_SHIFT;
+		--level;
+
+		entry = &entry->ptr_inner->entries[leaf_index >> capacity_shift];
+		leaf_index &= ((u32)1 << capacity_shift) - 1;
+
+		if (!entry->ptr_inner) {
+			if (alloc)
+				entry->ptr_inner = kzalloc(SIDTAB_NODE_ALLOC_SIZE,
+							   GFP_ATOMIC);
+			if (!entry->ptr_inner)
+				return NULL;
+		}
+	}
+	if (!entry->ptr_leaf) {
+		if (alloc)
+			entry->ptr_leaf = kzalloc(SIDTAB_NODE_ALLOC_SIZE,
+						  GFP_ATOMIC);
+		if (!entry->ptr_leaf)
+			return NULL;
+	}
+	return &entry->ptr_leaf->entries[index % SIDTAB_LEAF_ENTRIES].context;
+}
+
+static struct context *sidtab_lookup(struct sidtab *s, u32 index)
+{
+	u32 count = (u32)atomic_read(&s->count);
+
+	if (index >= count)
+		return NULL;
+
+	/* read entries after reading count */
+	smp_rmb();
+
+	return sidtab_do_lookup(s, index, 0);
 }
 
 static struct context *sidtab_search_core(struct sidtab *s, u32 sid, int force)
@@ -123,7 +143,7 @@ static struct context *sidtab_search_core(struct sidtab *s, u32 sid, int force)
 	struct context *context;
 	struct sidtab_isid_entry *entry;
 
-	if (!s || sid == 0)
+	if (sid == 0)
 		return NULL;
 
 	if (sid > SECINITSID_NUM) {
@@ -149,140 +169,126 @@ struct context *sidtab_search_force(struct sidtab *s, u32 sid)
 	return sidtab_search_core(s, sid, 1);
 }
 
-static int sidtab_map(struct sidtab *s,
-		      int (*apply)(u32 sid,
-				   struct context *context,
-				   void *args),
-		      void *args)
+static int sidtab_find_context(union sidtab_entry_inner entry,
+			       u32 *pos, u32 count, u32 level,
+			       struct context *context, u32 *index)
 {
-	int i, rc = 0;
-	struct sidtab_node *cur;
+	int rc;
+	u32 i;
 
-	if (!s)
-		goto out;
+	if (level != 0) {
+		struct sidtab_node_inner *node = entry.ptr_inner;
 
-	for (i = 0; i < SIDTAB_SIZE; i++) {
-		cur = s->htable[i];
-		while (cur) {
-			rc = apply(cur->sid, &cur->context, args);
-			if (rc)
-				goto out;
-			cur = cur->next;
+		i = 0;
+		while (i < SIDTAB_INNER_ENTRIES && *pos < count) {
+			rc = sidtab_find_context(node->entries[i],
+						 pos, count, level - 1,
+						 context, index);
+			if (rc == 0)
+				return 0;
+			i++;
 		}
-	}
-out:
-	return rc;
-}
+	} else {
+		struct sidtab_node_leaf *node = entry.ptr_leaf;
 
-/* Clone the SID into the new SID table. */
-static int clone_sid(u32 sid, struct context *context, void *arg)
-{
-	struct sidtab *s = arg;
-	return sidtab_insert(s, sid, context);
+		i = 0;
+		while (i < SIDTAB_LEAF_ENTRIES && *pos < count) {
+			if (context_cmp(&node->entries[i].context, context)) {
+				*index = *pos;
+				return 0;
+			}
+			(*pos)++;
+			i++;
+		}
+	}
+	return -ENOENT;
 }
 
-int sidtab_convert(struct sidtab *s, struct sidtab *news,
-		   int (*convert)(u32 sid,
-				  struct context *context,
-				  void *args),
-		   void *args)
+static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
+				 u32 *index)
 {
 	unsigned long flags;
+	u32 count = (u32)atomic_read(&s->count);
+	u32 count_locked, level, pos;
+	struct sidtab_convert_params *convert;
+	struct context *dst, *dst_convert;
 	int rc;
 
-	spin_lock_irqsave(&s->lock, flags);
-	s->shutdown = 1;
-	spin_unlock_irqrestore(&s->lock, flags);
+	level = sidtab_level_from_count(count);
 
-	rc = sidtab_map(s, clone_sid, news);
-	if (rc)
-		return rc;
+	/* read entries after reading count */
+	smp_rmb();
 
-	return sidtab_map(news, convert, args);
-}
+	pos = 0;
+	rc = sidtab_find_context(s->roots[level], &pos, count, level,
+				 context, index);
+	if (rc == 0)
+		return 0;
 
-static void sidtab_update_cache(struct sidtab *s, struct sidtab_node *n, int loc)
-{
-	BUG_ON(loc >= SIDTAB_CACHE_LEN);
+	/* lock-free search failed: lock, re-search, and insert if not found */
+	spin_lock_irqsave(&s->lock, flags);
 
-	while (loc > 0) {
-		s->cache[loc] = s->cache[loc - 1];
-		loc--;
-	}
-	s->cache[0] = n;
-}
+	convert = s->convert;
+	count_locked = (u32)atomic_read(&s->count);
+	level = sidtab_level_from_count(count_locked);
 
-static inline int sidtab_search_context(struct sidtab *s,
-					struct context *context, u32 *sid)
-{
-	int i;
-	struct sidtab_node *cur;
-
-	for (i = 0; i < SIDTAB_SIZE; i++) {
-		cur = s->htable[i];
-		while (cur) {
-			if (context_cmp(&cur->context, context)) {
-				sidtab_update_cache(s, cur, SIDTAB_CACHE_LEN - 1);
-				*sid = cur->sid;
-				return 0;
-			}
-			cur = cur->next;
+	/* if count has changed before we acquired the lock, then catch up */
+	while (count < count_locked) {
+		if (context_cmp(sidtab_do_lookup(s, count, 0), context)) {
+			rc = 0;
+			*index = count;
+			goto out_unlock;
 		}
+		++count;
 	}
-	return -ENOENT;
-}
 
-static inline int sidtab_search_cache(struct sidtab *s, struct context *context,
-				      u32 *sid)
-{
-	int i;
-	struct sidtab_node *node;
-
-	for (i = 0; i < SIDTAB_CACHE_LEN; i++) {
-		node = s->cache[i];
-		if (unlikely(!node))
-			return -ENOENT;
-		if (context_cmp(&node->context, context)) {
-			sidtab_update_cache(s, node, i);
-			*sid = node->sid;
-			return 0;
-		}
-	}
-	return -ENOENT;
-}
+	/* insert context into new entry */
+	rc = -ENOMEM;
+	dst = sidtab_do_lookup(s, count, 1);
+	if (!dst)
+		goto out_unlock;
 
-static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
-				 u32 *sid)
-{
-	int ret;
-	unsigned long flags;
+	rc = context_cpy(dst, context);
+	if (rc)
+		goto out_unlock;
+
+	/*
+	 * if we are building a new sidtab, we need to convert the context
+	 * and insert it there as well
+	 */
+	if (convert) {
+		rc = -ENOMEM;
+		dst_convert = sidtab_do_lookup(convert->target, count, 1);
+		if (!dst_convert) {
+			context_destroy(dst);
+			goto out_unlock;
+		}
 
-	ret = sidtab_search_cache(s, context, sid);
-	if (ret)
-		ret = sidtab_search_context(s, context, sid);
-	if (ret) {
-		spin_lock_irqsave(&s->lock, flags);
-		/* Rescan now that we hold the lock. */
-		ret = sidtab_search_context(s, context, sid);
-		if (!ret)
-			goto unlock_out;
-		/* No SID exists for the context.  Allocate a new one. */
-		if (s->next_sid == (UINT_MAX - SECINITSID_NUM - 1) || s->shutdown) {
-			ret = -ENOMEM;
-			goto unlock_out;
+		rc = convert->func(context, dst_convert, convert->args);
+		if (rc) {
+			context_destroy(dst);
+			goto out_unlock;
 		}
-		*sid = s->next_sid++;
-		if (context->len)
-			pr_info("SELinux:  Context %s is not valid (left unmapped).\n",
-			       context->str);
-		ret = sidtab_insert(s, *sid, context);
-		if (ret)
-			s->next_sid--;
-unlock_out:
-		spin_unlock_irqrestore(&s->lock, flags);
+
+		/* at this point we know the insert won't fail */
+		atomic_set(&convert->target->count, count + 1);
 	}
 
-	return ret;
+	if (context->len)
+		pr_info("SELinux:  Context %s is not valid (left unmapped).\n",
+			context->str);
+
+	*index = count;
+
+	/* write entries before writing new count */
+	smp_wmb();
+
+	atomic_set(&s->count, count + 1);
+
+	rc = 0;
+out_unlock:
+	spin_unlock_irqrestore(&s->lock, flags);
+	return rc;
 }
 
 int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid)
@@ -306,58 +312,134 @@ int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid)
 	return 0;
 }
 
-void sidtab_hash_eval(struct sidtab *h, char *tag)
+static int sidtab_convert_tree(union sidtab_entry_inner *edst,
+			       union sidtab_entry_inner *esrc,
+			       u32 *pos, u32 count, u32 level,
+			       struct sidtab_convert_params *convert)
 {
-	int i, chain_len, slots_used, max_chain_len;
-	struct sidtab_node *cur;
-
-	slots_used = 0;
-	max_chain_len = 0;
-	for (i = 0; i < SIDTAB_SIZE; i++) {
-		cur = h->htable[i];
-		if (cur) {
-			slots_used++;
-			chain_len = 0;
-			while (cur) {
-				chain_len++;
-				cur = cur->next;
-			}
+	int rc;
+	u32 i;
 
-			if (chain_len > max_chain_len)
-				max_chain_len = chain_len;
+	if (level != 0) {
+		if (!edst->ptr_inner) {
+			edst->ptr_inner = kzalloc(SIDTAB_NODE_ALLOC_SIZE, GFP_KERNEL);
+			if (!edst->ptr_inner)
+				return -ENOMEM;
+		}
+		i = 0;
+		while (i < SIDTAB_INNER_ENTRIES && *pos < count) {
+			rc = sidtab_convert_tree(&edst->ptr_inner->entries[i],
+						 &esrc->ptr_inner->entries[i],
+						 pos, count, level - 1, convert);
+			if (rc)
+				return rc;
+			i++;
 		}
+	} else {
+		if (!edst->ptr_leaf) {
+			edst->ptr_leaf = kzalloc(SIDTAB_NODE_ALLOC_SIZE, GFP_KERNEL);
+			if (!edst->ptr_leaf)
+				return -ENOMEM;
+		}
+		i = 0;
+		while (i < SIDTAB_LEAF_ENTRIES && *pos < count) {
+			rc = convert->func(&esrc->ptr_leaf->entries[i].context,
+					   &edst->ptr_leaf->entries[i].context,
+					   convert->args);
+			if (rc)
+				return rc;
+			(*pos)++;
+			i++;
+		}
+		cond_resched();
+	}
+	return 0;
+}
+
+int sidtab_convert(struct sidtab *s, struct sidtab_convert_params *params)
+{
+	unsigned long flags;
+	u32 count, level, pos;
+	int rc;
+
+	spin_lock_irqsave(&s->lock, flags);
+
+	/* concurrent policy loads are not allowed */
+	if (s->convert) {
+		spin_unlock_irqrestore(&s->lock, flags);
+		return -EBUSY;
+	}
+
+	count = (u32)atomic_read(&s->count);
+	level = sidtab_level_from_count(count);
+
+	/* allocate last leaf in the new sidtab (to avoid race with live convert) */
+	rc = sidtab_do_lookup(params->target, count - 1, 1) ? 0 : -ENOMEM;
+	if (rc) {
+		spin_unlock_irqrestore(&s->lock, flags);
+		return rc;
+	}
+
+	/* set count in case no new entries are added during conversion */
+	atomic_set(&params->target->count, count);
+
+	/* enable live convert of new entries */
+	s->convert = params;
+
+	/* we can safely do the rest of the conversion outside the lock */
+	spin_unlock_irqrestore(&s->lock, flags);
+
+	pr_info("SELinux:  Converting %u SID table entries...\n", count);
+
+	/* convert all entries not covered by live convert */
+	pos = 0;
+	rc = sidtab_convert_tree(&params->target->roots[level], &s->roots[level],
+				 &pos, count, level, params);
+	if (rc) {
+		/* we need to keep the old table - disable live convert */
+		spin_lock_irqsave(&s->lock, flags);
+		s->convert = NULL;
+		spin_unlock_irqrestore(&s->lock, flags);
 	}
+	return rc;
+}
+
+static void sidtab_destroy_tree(union sidtab_entry_inner entry, u32 level)
+{
+	u32 i;
+
+	if (level != 0) {
+		struct sidtab_node_inner *node = entry.ptr_inner;
 
-	pr_debug("%s:  %d entries and %d/%d buckets used, longest "
-	       "chain length %d\n", tag, h->nel, slots_used, SIDTAB_SIZE,
-	       max_chain_len);
+		if (!node)
+			return;
+
+		for (i = 0; i < SIDTAB_INNER_ENTRIES; i++)
+			sidtab_destroy_tree(node->entries[i], level - 1);
+		kfree(node);
+	} else {
+		struct sidtab_node_leaf *node = entry.ptr_leaf;
+
+		if (!node)
+			return;
+
+		for (i = 0; i < SIDTAB_LEAF_ENTRIES; i++)
+			context_destroy(&node->entries[i].context);
+		kfree(node);
+	}
 }
 
 void sidtab_destroy(struct sidtab *s)
 {
-	int i;
-	struct sidtab_node *cur, *temp;
-
-	if (!s)
-		return;
+	u32 i, level;
 
 	for (i = 0; i < SECINITSID_NUM; i++)
 		if (s->isids[i].set)
 			context_destroy(&s->isids[i].context);
 
+	level = SIDTAB_MAX_LEVEL;
+	while (level && !s->roots[level].ptr_inner)
+		--level;
 
-	for (i = 0; i < SIDTAB_SIZE; i++) {
-		cur = s->htable[i];
-		while (cur) {
-			temp = cur;
-			cur = cur->next;
-			context_destroy(&temp->context);
-			kfree(temp);
-		}
-		s->htable[i] = NULL;
-	}
-	kfree(s->htable);
-	s->htable = NULL;
-	s->nel = 0;
-	s->next_sid = 1;
+	sidtab_destroy_tree(s->roots[level], level);
 }
diff --git a/security/selinux/ss/sidtab.h b/security/selinux/ss/sidtab.h
index e657ae6bf996..292512792a70 100644
--- a/security/selinux/ss/sidtab.h
+++ b/security/selinux/ss/sidtab.h
@@ -1,39 +1,75 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 /*
- * A security identifier table (sidtab) is a hash table
+ * A security identifier table (sidtab) is a lookup table
  * of security context structures indexed by SID value.
  *
- * Author : Stephen Smalley, <sds@tycho.nsa.gov>
+ * Original author: Stephen Smalley, <sds@tycho.nsa.gov>
+ * Author: Ondrej Mosnacek, <omosnacek@gmail.com>
+ *
+ * Copyright (C) 2018 Red Hat, Inc.
  */
 #ifndef _SS_SIDTAB_H_
 #define _SS_SIDTAB_H_
 
+#include <linux/spinlock_types.h>
+#include <linux/log2.h>
+
 #include "context.h"
 
-struct sidtab_node {
-	u32 sid;		/* security identifier */
-	struct context context;	/* security context structure */
-	struct sidtab_node *next;
+struct sidtab_entry_leaf {
+	struct context context;
+};
+
+struct sidtab_node_inner;
+struct sidtab_node_leaf;
+
+union sidtab_entry_inner {
+	struct sidtab_node_inner *ptr_inner;
+	struct sidtab_node_leaf  *ptr_leaf;
 };
 
-#define SIDTAB_HASH_BITS 7
-#define SIDTAB_HASH_BUCKETS (1 << SIDTAB_HASH_BITS)
-#define SIDTAB_HASH_MASK (SIDTAB_HASH_BUCKETS-1)
+/* align node size to page boundary */
+#define SIDTAB_NODE_ALLOC_SHIFT PAGE_SHIFT
+#define SIDTAB_NODE_ALLOC_SIZE  PAGE_SIZE
+
+#define size_to_shift(size) ((size) == 1 ? 1 : (const_ilog2((size) - 1) + 1))
+
+#define SIDTAB_INNER_SHIFT \
+	(SIDTAB_NODE_ALLOC_SHIFT - size_to_shift(sizeof(union sidtab_entry_inner)))
 
-#define SIDTAB_SIZE SIDTAB_HASH_BUCKETS
+#define SIDTAB_LEAF_ENTRIES  (PAGE_SIZE / sizeof(struct sidtab_entry_leaf))
+#define SIDTAB_INNER_ENTRIES ((size_t)1 << SIDTAB_INNER_SHIFT)
+
+#define SIDTAB_MAX_BITS 31 /* limited to INT_MAX due to atomic_t range */
+#define SIDTAB_MAX (((u32)1 << SIDTAB_MAX_BITS) - 1)
+/* ensure enough tree levels for SIDTAB_MAX entries */
+#define SIDTAB_MAX_LEVEL \
+	DIV_ROUND_UP(SIDTAB_MAX_BITS - size_to_shift(SIDTAB_LEAF_ENTRIES), \
+		     SIDTAB_INNER_SHIFT)
+
+struct sidtab_node_leaf {
+	struct sidtab_entry_leaf entries[SIDTAB_LEAF_ENTRIES];
+};
+
+struct sidtab_node_inner {
+	union sidtab_entry_inner entries[SIDTAB_INNER_ENTRIES];
+};
 
 struct sidtab_isid_entry {
 	int set;
 	struct context context;
 };
 
+struct sidtab_convert_params {
+	int (*func)(struct context *oldc, struct context *newc, void *args);
+	void *args;
+	struct sidtab *target;
+};
+
 struct sidtab {
-	struct sidtab_node **htable;
-	unsigned int nel;	/* number of elements */
-	unsigned int next_sid;	/* next SID to allocate */
-	unsigned char shutdown;
-#define SIDTAB_CACHE_LEN	3
-	struct sidtab_node *cache[SIDTAB_CACHE_LEN];
+	union sidtab_entry_inner roots[SIDTAB_MAX_LEVEL + 1];
+	atomic_t count;
+	struct sidtab_convert_params *convert;
 	spinlock_t lock;
 
 	/* index == SID - 1 (no entry for SECSID_NULL) */
@@ -45,15 +81,10 @@ int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context);
 struct context *sidtab_search(struct sidtab *s, u32 sid);
 struct context *sidtab_search_force(struct sidtab *s, u32 sid);
 
-int sidtab_convert(struct sidtab *s, struct sidtab *news,
-		   int (*apply)(u32 sid,
-				struct context *context,
-				void *args),
-		   void *args);
+int sidtab_convert(struct sidtab *s, struct sidtab_convert_params *params);
 
 int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid);
 
-void sidtab_hash_eval(struct sidtab *h, char *tag);
 void sidtab_destroy(struct sidtab *s);
 
 #endif	/* _SS_SIDTAB_H_ */
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH v2 4/4] [squash] add back reverse lookup cache to sidtab
  2018-11-27 10:36 [RFC PATCH v2 0/4] Fix ENOMEM errors during policy reload Ondrej Mosnacek
                   ` (2 preceding siblings ...)
  2018-11-27 10:36 ` [RFC PATCH v2 3/4] selinux: overhaul sidtab to fix bug and improve performance Ondrej Mosnacek
@ 2018-11-27 10:36 ` Ondrej Mosnacek
  3 siblings, 0 replies; 11+ messages in thread
From: Ondrej Mosnacek @ 2018-11-27 10:36 UTC (permalink / raw)
  To: selinux, Paul Moore; +Cc: Stephen Smalley, Ondrej Mosnacek

This patch adds a simple cache for reverse lookup to the new sidtab
implementation. The cache works in a very similar way as the old
implementation's cache. The only difference is that instead of storing
pointers to the hash table nodes it stores just the indices of the
'cached' entries.

The new cache ensures that the indices are loaded/stored atomically, but
it still has the drawback that concurrent cache updates may mess up the
contents of the cache. Such situation however only reduces its
effectivity, not the correctness of lookups.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
 security/selinux/ss/sidtab.c | 49 ++++++++++++++++++++++++++++++++++--
 security/selinux/ss/sidtab.h |  5 ++++
 2 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
index a82deecfac09..abd2beb6dafc 100644
--- a/security/selinux/ss/sidtab.c
+++ b/security/selinux/ss/sidtab.c
@@ -23,6 +23,9 @@ int sidtab_init(struct sidtab *s)
 
 	memset(s->roots, 0, sizeof(s->roots));
 
+	for (i = 0; i < SIDTAB_RCACHE_SIZE; i++)
+		atomic_set(&s->rcache[i], -1);
+
 	for (i = 0; i < SECINITSID_NUM; i++)
 		s->isids[i].set = 0;
 
@@ -204,6 +207,40 @@ static int sidtab_find_context(union sidtab_entry_inner entry,
 	return -ENOENT;
 }
 
+static void sidtab_rcache_update(struct sidtab *s, u32 index, u32 pos)
+{
+	while (pos > 0) {
+		atomic_set(&s->rcache[pos], atomic_read(&s->rcache[pos - 1]));
+		--pos;
+	}
+	atomic_set(&s->rcache[0], (int)index);
+}
+
+static void sidtab_rcache_push(struct sidtab *s, u32 index)
+{
+	sidtab_rcache_update(s, index, SIDTAB_RCACHE_SIZE - 1);
+}
+
+static int sidtab_rcache_search(struct sidtab *s, struct context *context,
+				u32 *index)
+{
+	u32 i;
+
+	for (i = 0; i < SIDTAB_RCACHE_SIZE; i++) {
+		int v = atomic_read(&s->rcache[i]);
+
+		if (v < 0)
+			continue;
+
+		if (context_cmp(sidtab_do_lookup(s, (u32)v, 0), context)) {
+			sidtab_rcache_update(s, (u32)v, i);
+			*index = (u32)v;
+			return 0;
+		}
+	}
+	return -ENOENT;
+}
+
 static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
 				 u32 *index)
 {
@@ -214,6 +251,10 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
 	struct context *dst, *dst_convert;
 	int rc;
 
+	rc = sidtab_rcache_search(s, context, index);
+	if (rc == 0)
+		return 0;
+
 	level = sidtab_level_from_count(count);
 
 	/* read entries after reading count */
@@ -222,8 +263,10 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
 	pos = 0;
 	rc = sidtab_find_context(s->roots[level], &pos, count, level,
 				 context, index);
-	if (rc == 0)
+	if (rc == 0) {
+		sidtab_rcache_push(s, *index);
 		return 0;
+	}
 
 	/* lock-free search failed: lock, re-search, and insert if not found */
 	spin_lock_irqsave(&s->lock, flags);
@@ -235,8 +278,9 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
 	/* if count has changed before we acquired the lock, then catch up */
 	while (count < count_locked) {
 		if (context_cmp(sidtab_do_lookup(s, count, 0), context)) {
-			rc = 0;
+			sidtab_rcache_push(s, count);
 			*index = count;
+			rc = 0;
 			goto out_unlock;
 		}
 		++count;
@@ -278,6 +322,7 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
 		pr_info("SELinux:  Context %s is not valid (left unmapped).\n",
 			context->str);
 
+	sidtab_rcache_push(s, count);
 	*index = count;
 
 	/* write entries before writing new count */
diff --git a/security/selinux/ss/sidtab.h b/security/selinux/ss/sidtab.h
index 292512792a70..2bf50bf12751 100644
--- a/security/selinux/ss/sidtab.h
+++ b/security/selinux/ss/sidtab.h
@@ -66,12 +66,17 @@ struct sidtab_convert_params {
 	struct sidtab *target;
 };
 
+#define SIDTAB_RCACHE_SIZE 4
+
 struct sidtab {
 	union sidtab_entry_inner roots[SIDTAB_MAX_LEVEL + 1];
 	atomic_t count;
 	struct sidtab_convert_params *convert;
 	spinlock_t lock;
 
+	/* reverse lookup cache */
+	atomic_t rcache[SIDTAB_RCACHE_SIZE];
+
 	/* index == SID - 1 (no entry for SECSID_NULL) */
 	struct sidtab_isid_entry isids[SECINITSID_NUM];
 };
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v2 2/4] [squash] do not store entry for SECSID_NULL
  2018-11-27 10:36 ` [RFC PATCH v2 2/4] [squash] do not store entry for SECSID_NULL Ondrej Mosnacek
@ 2018-11-27 17:00   ` Stephen Smalley
  2018-11-27 17:14     ` Stephen Smalley
  2018-11-27 19:45     ` Ondrej Mosnacek
  0 siblings, 2 replies; 11+ messages in thread
From: Stephen Smalley @ 2018-11-27 17:00 UTC (permalink / raw)
  To: Ondrej Mosnacek, selinux, Paul Moore

On 11/27/18 5:36 AM, Ondrej Mosnacek wrote:
> This patch is kept separate only for review. Eventually it will be
> folded into the previous patch.

This one triggers a lot of warnings (security_compute_av: unrecognized 
SID 0, security_sid_to_context_core: unrecognized SID 0) and some 
failures during selinux-testsuite inet_socket tests.  While the policy 
doesn't provide an entry for SECSID_NULL, the sidtab search logic was 
remapping it to the unlabeled context and that was apparently being 
relied upon by the labeled networking code IIUC.


> 
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> ---
>   security/selinux/ss/policydb.c |  2 +-
>   security/selinux/ss/sidtab.c   | 25 ++++++++++++++++---------
>   security/selinux/ss/sidtab.h   |  3 ++-
>   3 files changed, 19 insertions(+), 11 deletions(-)
> 
> diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
> index 59359fa0bd74..a50d625e7946 100644
> --- a/security/selinux/ss/policydb.c
> +++ b/security/selinux/ss/policydb.c
> @@ -912,7 +912,7 @@ int policydb_load_isids(struct policydb *p, struct sidtab *s)
>   			sidtab_destroy(s);
>   			goto out;
>   		}
> -		if (c->sid[0] > SECINITSID_NUM) {
> +		if (c->sid[0] == SECSID_NULL || c->sid[0] > SECINITSID_NUM) {
>   			pr_err("SELinux:  Initial SID %s out of range.\n",
>   				c->u.name);
>   			sidtab_destroy(s);
> diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
> index fd8115b211a6..e157d8240cf1 100644
> --- a/security/selinux/ss/sidtab.c
> +++ b/security/selinux/ss/sidtab.c
> @@ -23,7 +23,7 @@ int sidtab_init(struct sidtab *s)
>   	if (!s->htable)
>   		return -ENOMEM;
>   
> -	for (i = 0; i <= SECINITSID_NUM; i++)
> +	for (i = 0; i < SECINITSID_NUM; i++)
>   		s->isids[i].set = 0;
>   
>   	for (i = 0; i < SIDTAB_SIZE; i++)
> @@ -86,8 +86,15 @@ static int sidtab_insert(struct sidtab *s, u32 sid, struct context *context)
>   
>   int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
>   {
> -	struct sidtab_isid_entry *entry = &s->isids[sid];
> -	int rc = context_cpy(&entry->context, context);
> +	struct sidtab_isid_entry *entry;
> +	int rc;
> +
> +	if (sid == 0 || sid > SECINITSID_NUM)
> +		return -EINVAL;
> +
> +	entry = &s->isids[sid - 1];
> +
> +	rc = context_cpy(&entry->context, context);
>   	if (rc)
>   		return rc;
>   
> @@ -116,19 +123,19 @@ static struct context *sidtab_search_core(struct sidtab *s, u32 sid, int force)
>   	struct context *context;
>   	struct sidtab_isid_entry *entry;
>   
> -	if (!s)
> +	if (!s || sid == 0)
>   		return NULL;
>   
>   	if (sid > SECINITSID_NUM) {
>   		context = sidtab_lookup(s, sid - (SECINITSID_NUM + 1));
>   	} else {
> -		entry = &s->isids[sid];
> +		entry = &s->isids[sid - 1];
>   		context = entry->set ? &entry->context : NULL;
>   	}
>   	if (context && (!context->len || force))
>   		return context;
>   
> -	entry = &s->isids[SECINITSID_UNLABELED];
> +	entry = &s->isids[SECINITSID_UNLABELED - 1];
>   	return entry->set ? &entry->context : NULL;
>   }
>   
> @@ -283,11 +290,11 @@ int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid)
>   	int rc;
>   	u32 i;
>   
> -	for (i = 0; i <= SECINITSID_NUM; i++) {
> +	for (i = 0; i < SECINITSID_NUM; i++) {
>   		struct sidtab_isid_entry *entry = &s->isids[i];
>   
>   		if (entry->set && context_cmp(context, &entry->context)) {
> -			*sid = i;
> +			*sid = i + 1;
>   			return 0;
>   		}
>   	}
> @@ -334,7 +341,7 @@ void sidtab_destroy(struct sidtab *s)
>   	if (!s)
>   		return;
>   
> -	for (i = 0; i <= SECINITSID_NUM; i++)
> +	for (i = 0; i < SECINITSID_NUM; i++)
>   		if (s->isids[i].set)
>   			context_destroy(&s->isids[i].context);
>   
> diff --git a/security/selinux/ss/sidtab.h b/security/selinux/ss/sidtab.h
> index dc0a80bc8894..e657ae6bf996 100644
> --- a/security/selinux/ss/sidtab.h
> +++ b/security/selinux/ss/sidtab.h
> @@ -36,7 +36,8 @@ struct sidtab {
>   	struct sidtab_node *cache[SIDTAB_CACHE_LEN];
>   	spinlock_t lock;
>   
> -	struct sidtab_isid_entry isids[SECINITSID_NUM + 1];
> +	/* index == SID - 1 (no entry for SECSID_NULL) */
> +	struct sidtab_isid_entry isids[SECINITSID_NUM];
>   };
>   
>   int sidtab_init(struct sidtab *s);
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v2 2/4] [squash] do not store entry for SECSID_NULL
  2018-11-27 17:00   ` Stephen Smalley
@ 2018-11-27 17:14     ` Stephen Smalley
  2018-11-27 19:45     ` Ondrej Mosnacek
  1 sibling, 0 replies; 11+ messages in thread
From: Stephen Smalley @ 2018-11-27 17:14 UTC (permalink / raw)
  To: Ondrej Mosnacek, selinux, Paul Moore

On 11/27/18 12:00 PM, Stephen Smalley wrote:
> On 11/27/18 5:36 AM, Ondrej Mosnacek wrote:
>> This patch is kept separate only for review. Eventually it will be
>> folded into the previous patch.
> 
> This one triggers a lot of warnings (security_compute_av: unrecognized 
> SID 0, security_sid_to_context_core: unrecognized SID 0) and some 
> failures during selinux-testsuite inet_socket tests.  While the policy 
> doesn't provide an entry for SECSID_NULL, the sidtab search logic was 
> remapping it to the unlabeled context and that was apparently being 
> relied upon by the labeled networking code IIUC.

NB I had active ssh sessions to the system at the time of the test (and 
was in fact running the testsuite in one of the ssh sessions).  All of 
the sessions froze during the inet_sockets tests, presumably when 
labeled networking was configured, and then came back to life later, 
presumably once labeled network was un-configured.

> 
> 
>>
>> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
>> ---
>>   security/selinux/ss/policydb.c |  2 +-
>>   security/selinux/ss/sidtab.c   | 25 ++++++++++++++++---------
>>   security/selinux/ss/sidtab.h   |  3 ++-
>>   3 files changed, 19 insertions(+), 11 deletions(-)
>>
>> diff --git a/security/selinux/ss/policydb.c 
>> b/security/selinux/ss/policydb.c
>> index 59359fa0bd74..a50d625e7946 100644
>> --- a/security/selinux/ss/policydb.c
>> +++ b/security/selinux/ss/policydb.c
>> @@ -912,7 +912,7 @@ int policydb_load_isids(struct policydb *p, struct 
>> sidtab *s)
>>               sidtab_destroy(s);
>>               goto out;
>>           }
>> -        if (c->sid[0] > SECINITSID_NUM) {
>> +        if (c->sid[0] == SECSID_NULL || c->sid[0] > SECINITSID_NUM) {
>>               pr_err("SELinux:  Initial SID %s out of range.\n",
>>                   c->u.name);
>>               sidtab_destroy(s);
>> diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
>> index fd8115b211a6..e157d8240cf1 100644
>> --- a/security/selinux/ss/sidtab.c
>> +++ b/security/selinux/ss/sidtab.c
>> @@ -23,7 +23,7 @@ int sidtab_init(struct sidtab *s)
>>       if (!s->htable)
>>           return -ENOMEM;
>> -    for (i = 0; i <= SECINITSID_NUM; i++)
>> +    for (i = 0; i < SECINITSID_NUM; i++)
>>           s->isids[i].set = 0;
>>       for (i = 0; i < SIDTAB_SIZE; i++)
>> @@ -86,8 +86,15 @@ static int sidtab_insert(struct sidtab *s, u32 sid, 
>> struct context *context)
>>   int sidtab_set_initial(struct sidtab *s, u32 sid, struct context 
>> *context)
>>   {
>> -    struct sidtab_isid_entry *entry = &s->isids[sid];
>> -    int rc = context_cpy(&entry->context, context);
>> +    struct sidtab_isid_entry *entry;
>> +    int rc;
>> +
>> +    if (sid == 0 || sid > SECINITSID_NUM)
>> +        return -EINVAL;
>> +
>> +    entry = &s->isids[sid - 1];
>> +
>> +    rc = context_cpy(&entry->context, context);
>>       if (rc)
>>           return rc;
>> @@ -116,19 +123,19 @@ static struct context *sidtab_search_core(struct 
>> sidtab *s, u32 sid, int force)
>>       struct context *context;
>>       struct sidtab_isid_entry *entry;
>> -    if (!s)
>> +    if (!s || sid == 0)
>>           return NULL;
>>       if (sid > SECINITSID_NUM) {
>>           context = sidtab_lookup(s, sid - (SECINITSID_NUM + 1));
>>       } else {
>> -        entry = &s->isids[sid];
>> +        entry = &s->isids[sid - 1];
>>           context = entry->set ? &entry->context : NULL;
>>       }
>>       if (context && (!context->len || force))
>>           return context;
>> -    entry = &s->isids[SECINITSID_UNLABELED];
>> +    entry = &s->isids[SECINITSID_UNLABELED - 1];
>>       return entry->set ? &entry->context : NULL;
>>   }
>> @@ -283,11 +290,11 @@ int sidtab_context_to_sid(struct sidtab *s, 
>> struct context *context, u32 *sid)
>>       int rc;
>>       u32 i;
>> -    for (i = 0; i <= SECINITSID_NUM; i++) {
>> +    for (i = 0; i < SECINITSID_NUM; i++) {
>>           struct sidtab_isid_entry *entry = &s->isids[i];
>>           if (entry->set && context_cmp(context, &entry->context)) {
>> -            *sid = i;
>> +            *sid = i + 1;
>>               return 0;
>>           }
>>       }
>> @@ -334,7 +341,7 @@ void sidtab_destroy(struct sidtab *s)
>>       if (!s)
>>           return;
>> -    for (i = 0; i <= SECINITSID_NUM; i++)
>> +    for (i = 0; i < SECINITSID_NUM; i++)
>>           if (s->isids[i].set)
>>               context_destroy(&s->isids[i].context);
>> diff --git a/security/selinux/ss/sidtab.h b/security/selinux/ss/sidtab.h
>> index dc0a80bc8894..e657ae6bf996 100644
>> --- a/security/selinux/ss/sidtab.h
>> +++ b/security/selinux/ss/sidtab.h
>> @@ -36,7 +36,8 @@ struct sidtab {
>>       struct sidtab_node *cache[SIDTAB_CACHE_LEN];
>>       spinlock_t lock;
>> -    struct sidtab_isid_entry isids[SECINITSID_NUM + 1];
>> +    /* index == SID - 1 (no entry for SECSID_NULL) */
>> +    struct sidtab_isid_entry isids[SECINITSID_NUM];
>>   };
>>   int sidtab_init(struct sidtab *s);
>>
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v2 3/4] selinux: overhaul sidtab to fix bug and improve performance
  2018-11-27 10:36 ` [RFC PATCH v2 3/4] selinux: overhaul sidtab to fix bug and improve performance Ondrej Mosnacek
@ 2018-11-27 19:41   ` Stephen Smalley
  2018-11-27 20:03     ` Ondrej Mosnacek
  0 siblings, 1 reply; 11+ messages in thread
From: Stephen Smalley @ 2018-11-27 19:41 UTC (permalink / raw)
  To: Ondrej Mosnacek, selinux, Paul Moore

On 11/27/18 5:36 AM, Ondrej Mosnacek wrote:
> Before this patch, during a policy reload the sidtab would become frozen
> and trying to map a new context to SID would be unable to add a new
> entry to sidtab and fail with -ENOMEM.
> 
> Such failures are usually propagated into userspace, which has no way of
> distignuishing them from actual allocation failures and thus doesn't
> handle them gracefully. Such situation can be triggered e.g. by the
> following reproducer:
> 
>      while true; do load_policy; echo -n .; sleep 0.1; done &
>      for (( i = 0; i < 1024; i++ )); do
>          runcon -l s0:c$i echo -n x || break
>          # or:
>          # chcon -l s0:c$i <some_file> || break
>      done
> 
> This patch overhauls the sidtab so it doesn't need to be frozen during
> policy reload, thus solving the above problem.
> 
> The new SID table leverages the fact that SIDs are allocated
> sequentially and are never invalidated and stores them in linear buckets
> indexed by a tree structure. This brings several advantages:
>    1. Fast SID -> context lookup - this lookup can now be done in
>       logarithmic time complexity (usually in less than 4 array lookups)
>       and can still be done safely without locking.
>    2. No need to re-search the whole table on reverse lookup miss - after
>       acquiring the spinlock only the newly added entries need to be
>       searched, which means that reverse lookups that end up inserting a
>       new entry are now about twice as fast.
>    3. No need to freeze sidtab during policy reload - it is now possible
>       to handle insertion of new entries even during sidtab conversion.
> 
> The tree structure of the new sidtab is able to grow automatically to up
> to about 2^31 entries (at which point it should not have more than about
> 4 tree levels). The old sidtab had a theoretical capacity of almost 2^32
> entries, but half of that is still more than enough since by that point
> the reverse table lookups would become unusably slow anyway...
> 
> The number of entries per tree node is selected automatically so that
> each node fits into a single page, which should be the easiest size for
> kmalloc() to handle.
> 
> Note that the new implementation does not have a cache for recent
> reverse lookups as the old one had. Such cache is, however, added in a
> subsequent patch.
> 
> Tested by selinux-testsuite and thoroughly tortured by this simple
> stress test:
> ```
> function rand_cat() {
> 	echo $(( $RANDOM % 1024 ))
> }
> 
> function do_work() {
> 	while true; do
> 		echo -n "system_u:system_r:kernel_t:s0:c$(rand_cat),c$(rand_cat)" \
> 			>/sys/fs/selinux/context 2>/dev/null || true
> 	done
> }
> 
> do_work >/dev/null &
> do_work >/dev/null &
> do_work >/dev/null &
> 
> while load_policy; do echo -n .; sleep 0.1; done
> 
> kill %1
> kill %2
> kill %3
> ```
> 
> Reported-by: Orion Poplawski <orion@nwra.com>
> Reported-by: Li Kun <hw.likun@huawei.com>
> Link: https://github.com/SELinuxProject/selinux-kernel/issues/38
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> ---
>   security/selinux/ss/mls.c      |  23 +-
>   security/selinux/ss/mls.h      |   3 +-
>   security/selinux/ss/services.c |  90 +++---
>   security/selinux/ss/sidtab.c   | 522 +++++++++++++++++++--------------
>   security/selinux/ss/sidtab.h   |  75 +++--
>   5 files changed, 402 insertions(+), 311 deletions(-)
> 
> diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c
> index 2fe459df3c85..267ae4f9be79 100644
> --- a/security/selinux/ss/mls.c
> +++ b/security/selinux/ss/mls.c
> @@ -436,16 +436,17 @@ int mls_setup_user_range(struct policydb *p,
>   
>   /*
>    * Convert the MLS fields in the security context
> - * structure `c' from the values specified in the
> - * policy `oldp' to the values specified in the policy `newp'.
> + * structure `oldc' from the values specified in the
> + * policy `oldp' to the values specified in the policy `newp',
> + * storing the resulting context in `newc'.
>    */
>   int mls_convert_context(struct policydb *oldp,
>   			struct policydb *newp,
> -			struct context *c)
> +			struct context *oldc,
> +			struct context *newc)
>   {
>   	struct level_datum *levdatum;
>   	struct cat_datum *catdatum;
> -	struct ebitmap bitmap;
>   	struct ebitmap_node *node;
>   	int l, i;
>   
> @@ -455,28 +456,24 @@ int mls_convert_context(struct policydb *oldp,
>   	for (l = 0; l < 2; l++) {
>   		levdatum = hashtab_search(newp->p_levels.table,
>   					  sym_name(oldp, SYM_LEVELS,
> -						   c->range.level[l].sens - 1));
> +						   oldc->range.level[l].sens - 1));
>   
>   		if (!levdatum)
>   			return -EINVAL;
> -		c->range.level[l].sens = levdatum->level->sens;
> +		newc->range.level[l].sens = levdatum->level->sens;
>   
> -		ebitmap_init(&bitmap);
> -		ebitmap_for_each_positive_bit(&c->range.level[l].cat, node, i) {
> +		ebitmap_for_each_positive_bit(&oldc->range.level[l].cat, node, i) {
>   			int rc;
>   
>   			catdatum = hashtab_search(newp->p_cats.table,
>   						  sym_name(oldp, SYM_CATS, i));
>   			if (!catdatum)
>   				return -EINVAL;
> -			rc = ebitmap_set_bit(&bitmap, catdatum->value - 1, 1);
> +			rc = ebitmap_set_bit(&newc->range.level[l].cat,
> +					     catdatum->value - 1, 1);
>   			if (rc)
>   				return rc;
> -
> -			cond_resched();
>   		}
> -		ebitmap_destroy(&c->range.level[l].cat);
> -		c->range.level[l].cat = bitmap;
>   	}
>   
>   	return 0;
> diff --git a/security/selinux/ss/mls.h b/security/selinux/ss/mls.h
> index 67093647576d..7954b1e60b64 100644
> --- a/security/selinux/ss/mls.h
> +++ b/security/selinux/ss/mls.h
> @@ -46,7 +46,8 @@ int mls_range_set(struct context *context, struct mls_range *range);
>   
>   int mls_convert_context(struct policydb *oldp,
>   			struct policydb *newp,
> -			struct context *context);
> +			struct context *oldc,
> +			struct context *newc);
>   
>   int mls_compute_sid(struct policydb *p,
>   		    struct context *scontext,
> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
> index 30170d4c567a..3c5887838e12 100644
> --- a/security/selinux/ss/services.c
> +++ b/security/selinux/ss/services.c
> @@ -1907,19 +1907,16 @@ struct convert_context_args {
>   
>   /*
>    * Convert the values in the security context
> - * structure `c' from the values specified
> + * structure `oldc' from the values specified
>    * in the policy `p->oldp' to the values specified
> - * in the policy `p->newp'.  Verify that the
> - * context is valid under the new policy.
> + * in the policy `p->newp', storing the new context
> + * in `newc'.  Verify that the context is valid
> + * under the new policy.
>    */
> -static int convert_context(u32 key,
> -			   struct context *c,
> -			   void *p)
> +static int convert_context(struct context *oldc, struct context *newc, void *p)
>   {
>   	struct convert_context_args *args;
> -	struct context oldc;
>   	struct ocontext *oc;
> -	struct mls_range *range;
>   	struct role_datum *role;
>   	struct type_datum *typdatum;
>   	struct user_datum *usrdatum;
> @@ -1929,23 +1926,18 @@ static int convert_context(u32 key,
>   
>   	args = p;
>   
> -	if (c->str) {
> -		struct context ctx;
> -
> +	if (oldc->str) {
>   		rc = -ENOMEM;
> -		s = kstrdup(c->str, GFP_KERNEL);
> +		s = kstrdup(oldc->str, GFP_KERNEL);
>   		if (!s)
>   			goto out;
>   
>   		rc = string_to_context_struct(args->newp, NULL, s,
> -					      &ctx, SECSID_NULL);
> +					      newc, SECSID_NULL);
>   		kfree(s);
>   		if (!rc) {
>   			pr_info("SELinux:  Context %s became valid (mapped).\n",
> -			       c->str);
> -			/* Replace string with mapped representation. */
> -			kfree(c->str);
> -			memcpy(c, &ctx, sizeof(*c));
> +				oldc->str);
>   			goto out;
>   		} else if (rc == -EINVAL) {
>   			/* Retain string representation for later mapping. */
I haven't looked closely at the new sidtab implementation yet, but I was 
encountering KASAN failures in testing and noticed that they seemed to 
be tied to invalid/unmapped contexts in the filesystem.

Reproducer:
$ su
# setenforce 0
# touch foo
# setfattr -n security.selinux -v thisisnotavalidcontext foo
# ls -Z foo
# load_policy
# ls -Z foo

The issue in this case seems to be that you aren't actually retaining 
the string representation of the invalid/unmapped context across a 
reload after this patch because you never copy it to newc, e.g. replace 
the rc = 0; line with rc = context_cpy(newc, oldc); in the EINVAL case 
above.  Otherwise, you'll be left with an uninitialized newc. 
Previously we didn't have to do anything here because we were just 
leaving c (oldc) intact.

> @@ -1954,51 +1946,42 @@ static int convert_context(u32 key,
>   		} else {
>   			/* Other error condition, e.g. ENOMEM. */
>   			pr_err("SELinux:   Unable to map context %s, rc = %d.\n",
> -			       c->str, -rc);
> +			       oldc->str, -rc);
>   			goto out;
>   		}
>   	}
>   
> -	rc = context_cpy(&oldc, c);
> -	if (rc)
> -		goto out;
> +	context_init(newc);
>   
>   	/* Convert the user. */
>   	rc = -EINVAL;
>   	usrdatum = hashtab_search(args->newp->p_users.table,
> -				  sym_name(args->oldp, SYM_USERS, c->user - 1));
> +				  sym_name(args->oldp, SYM_USERS, oldc->user - 1));
>   	if (!usrdatum)
>   		goto bad;
> -	c->user = usrdatum->value;
> +	newc->user = usrdatum->value;
>   
>   	/* Convert the role. */
>   	rc = -EINVAL;
>   	role = hashtab_search(args->newp->p_roles.table,
> -			      sym_name(args->oldp, SYM_ROLES, c->role - 1));
> +			      sym_name(args->oldp, SYM_ROLES, oldc->role - 1));
>   	if (!role)
>   		goto bad;
> -	c->role = role->value;
> +	newc->role = role->value;
>   
>   	/* Convert the type. */
>   	rc = -EINVAL;
>   	typdatum = hashtab_search(args->newp->p_types.table,
> -				  sym_name(args->oldp, SYM_TYPES, c->type - 1));
> +				  sym_name(args->oldp, SYM_TYPES, oldc->type - 1));
>   	if (!typdatum)
>   		goto bad;
> -	c->type = typdatum->value;
> +	newc->type = typdatum->value;
>   
>   	/* Convert the MLS fields if dealing with MLS policies */
>   	if (args->oldp->mls_enabled && args->newp->mls_enabled) {
> -		rc = mls_convert_context(args->oldp, args->newp, c);
> +		rc = mls_convert_context(args->oldp, args->newp, oldc, newc);
>   		if (rc)
>   			goto bad;
> -	} else if (args->oldp->mls_enabled && !args->newp->mls_enabled) {
> -		/*
> -		 * Switching between MLS and non-MLS policy:
> -		 * free any storage used by the MLS fields in the
> -		 * context for all existing entries in the sidtab.
> -		 */
> -		mls_context_destroy(c);
>   	} else if (!args->oldp->mls_enabled && args->newp->mls_enabled) {
>   		/*
>   		 * Switching between non-MLS and MLS policy:
> @@ -2016,36 +1999,31 @@ static int convert_context(u32 key,
>   				" the initial SIDs list\n");
>   			goto bad;
>   		}
> -		range = &oc->context[0].range;
> -		rc = mls_range_set(c, range);
> +		rc = mls_range_set(newc, &oc->context[0].range);
>   		if (rc)
>   			goto bad;
>   	}
>   
>   	/* Check the validity of the new context. */
> -	if (!policydb_context_isvalid(args->newp, c)) {
> -		rc = convert_context_handle_invalid_context(args->state,
> -							    &oldc);
> +	if (!policydb_context_isvalid(args->newp, newc)) {
> +		rc = convert_context_handle_invalid_context(args->state, oldc);
>   		if (rc)
>   			goto bad;
>   	}
>   
> -	context_destroy(&oldc);
> -
>   	rc = 0;
>   out:
>   	return rc;
>   bad:
>   	/* Map old representation to string and save it. */
> -	rc = context_struct_to_string(args->oldp, &oldc, &s, &len);
> +	rc = context_struct_to_string(args->oldp, oldc, &s, &len);
>   	if (rc)
>   		return rc;
> -	context_destroy(&oldc);
> -	context_destroy(c);
> -	c->str = s;
> -	c->len = len;
> +	context_destroy(newc);
> +	newc->str = s;
> +	newc->len = len;
>   	pr_info("SELinux:  Context %s became invalid (unmapped).\n",
> -	       c->str);
> +		newc->str);
>   	rc = 0;
>   	goto out;
>   }
> @@ -2091,6 +2069,7 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
>   	struct policydb *oldpolicydb, *newpolicydb;
>   	struct selinux_mapping *oldmapping;
>   	struct selinux_map newmap;
> +	struct sidtab_convert_params convert_params;
>   	struct convert_context_args args;
>   	u32 seqno;
>   	int rc = 0;
> @@ -2147,12 +2126,6 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
>   		goto out;
>   	}
>   
> -	oldsidtab = state->ss->sidtab;
> -
> -#if 0
> -	sidtab_hash_eval(oldsidtab, "sids");
> -#endif
> -
>   	rc = policydb_read(newpolicydb, fp);
>   	if (rc) {
>   		kfree(newsidtab);
> @@ -2184,6 +2157,8 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
>   		goto err;
>   	}
>   
> +	oldsidtab = state->ss->sidtab;
> +
>   	/*
>   	 * Convert the internal representations of contexts
>   	 * in the new SID table.
> @@ -2191,7 +2166,12 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
>   	args.state = state;
>   	args.oldp = policydb;
>   	args.newp = newpolicydb;
> -	rc = sidtab_convert(oldsidtab, newsidtab, convert_context, &args);
> +
> +	convert_params.func = convert_context;
> +	convert_params.args = &args;
> +	convert_params.target = newsidtab;
> +
> +	rc = sidtab_convert(oldsidtab, &convert_params);
>   	if (rc) {
>   		pr_err("SELinux:  unable to convert the internal"
>   			" representation of contexts in the new SID"
> diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
> index e157d8240cf1..a82deecfac09 100644
> --- a/security/selinux/ss/sidtab.c
> +++ b/security/selinux/ss/sidtab.c
> @@ -2,88 +2,38 @@
>   /*
>    * Implementation of the SID table type.
>    *
> - * Author : Stephen Smalley, <sds@tycho.nsa.gov>
> + * Original author: Stephen Smalley, <sds@tycho.nsa.gov>
> + * Author: Ondrej Mosnacek, <omosnacek@gmail.com>
> + *
> + * Copyright (C) 2018 Red Hat, Inc.
>    */
> +#include <linux/errno.h>
>   #include <linux/kernel.h>
>   #include <linux/slab.h>
> +#include <linux/sched.h>
>   #include <linux/spinlock.h>
> -#include <linux/errno.h>
> +#include <linux/atomic.h>
>   #include "flask.h"
>   #include "security.h"
>   #include "sidtab.h"
>   
> -#define SIDTAB_HASH(sid) \
> -(sid & SIDTAB_HASH_MASK)
> -
>   int sidtab_init(struct sidtab *s)
>   {
> -	int i;
> +	u32 i;
>   
> -	s->htable = kmalloc_array(SIDTAB_SIZE, sizeof(*s->htable), GFP_ATOMIC);
> -	if (!s->htable)
> -		return -ENOMEM;
> +	memset(s->roots, 0, sizeof(s->roots));
>   
>   	for (i = 0; i < SECINITSID_NUM; i++)
>   		s->isids[i].set = 0;
>   
> -	for (i = 0; i < SIDTAB_SIZE; i++)
> -		s->htable[i] = NULL;
> +	atomic_set(&s->count, 0);
>   
> -	for (i = 0; i < SIDTAB_CACHE_LEN; i++)
> -		s->cache[i] = NULL;
> +	s->convert = NULL;
>   
> -	s->nel = 0;
> -	s->next_sid = 0;
> -	s->shutdown = 0;
>   	spin_lock_init(&s->lock);
>   	return 0;
>   }
>   
> -static int sidtab_insert(struct sidtab *s, u32 sid, struct context *context)
> -{
> -	int hvalue;
> -	struct sidtab_node *prev, *cur, *newnode;
> -
> -	if (!s)
> -		return -ENOMEM;
> -
> -	hvalue = SIDTAB_HASH(sid);
> -	prev = NULL;
> -	cur = s->htable[hvalue];
> -	while (cur && sid > cur->sid) {
> -		prev = cur;
> -		cur = cur->next;
> -	}
> -
> -	if (cur && sid == cur->sid)
> -		return -EEXIST;
> -
> -	newnode = kmalloc(sizeof(*newnode), GFP_ATOMIC);
> -	if (!newnode)
> -		return -ENOMEM;
> -
> -	newnode->sid = sid;
> -	if (context_cpy(&newnode->context, context)) {
> -		kfree(newnode);
> -		return -ENOMEM;
> -	}
> -
> -	if (prev) {
> -		newnode->next = prev->next;
> -		wmb();
> -		prev->next = newnode;
> -	} else {
> -		newnode->next = s->htable[hvalue];
> -		wmb();
> -		s->htable[hvalue] = newnode;
> -	}
> -
> -	s->nel++;
> -	if (sid >= s->next_sid)
> -		s->next_sid = sid + 1;
> -	return 0;
> -}
> -
>   int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
>   {
>   	struct sidtab_isid_entry *entry;
> @@ -102,20 +52,90 @@ int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
>   	return 0;
>   }
>   
> -static struct context *sidtab_lookup(struct sidtab *s, u32 sid)
> +static u32 sidtab_level_from_count(u32 count)
>   {
> -	int hvalue;
> -	struct sidtab_node *cur;
> +	u32 capacity = SIDTAB_LEAF_ENTRIES;
> +	u32 level = 0;
>   
> -	hvalue = SIDTAB_HASH(sid);
> -	cur = s->htable[hvalue];
> -	while (cur && sid > cur->sid)
> -		cur = cur->next;
> +	while (count > capacity) {
> +		capacity <<= SIDTAB_INNER_SHIFT;
> +		++level;
> +	}
> +	return level;
> +}
> +
> +static int sidtab_alloc_roots(struct sidtab *s, u32 level)
> +{
> +	u32 l;
> +
> +	if (!s->roots[0].ptr_leaf) {
> +		s->roots[0].ptr_leaf = kzalloc(SIDTAB_NODE_ALLOC_SIZE,
> +					       GFP_ATOMIC);
> +		if (!s->roots[0].ptr_leaf)
> +			return -ENOMEM;
> +	}
> +	for (l = 1; l <= level; ++l)
> +		if (!s->roots[l].ptr_inner) {
> +			s->roots[l].ptr_inner = kzalloc(SIDTAB_NODE_ALLOC_SIZE,
> +							GFP_ATOMIC);
> +			if (!s->roots[l].ptr_inner)
> +				return -ENOMEM;
> +			s->roots[l].ptr_inner->entries[0] = s->roots[l - 1];
> +		}
> +	return 0;
> +}
>   
> -	if (!cur || sid != cur->sid)
> +static struct context *sidtab_do_lookup(struct sidtab *s, u32 index, int alloc)
> +{
> +	union sidtab_entry_inner *entry;
> +	u32 level, capacity_shift, leaf_index = index / SIDTAB_LEAF_ENTRIES;
> +
> +	/* find the level of the subtree we need */
> +	level = sidtab_level_from_count(index + 1);
> +	capacity_shift = level * SIDTAB_INNER_SHIFT;
> +
> +	/* allocate roots if needed */
> +	if (alloc && sidtab_alloc_roots(s, level) != 0)
>   		return NULL;
>   
> -	return &cur->context;
> +	/* lookup inside the subtree */
> +	entry = &s->roots[level];
> +	while (level != 0) {
> +		capacity_shift -= SIDTAB_INNER_SHIFT;
> +		--level;
> +
> +		entry = &entry->ptr_inner->entries[leaf_index >> capacity_shift];
> +		leaf_index &= ((u32)1 << capacity_shift) - 1;
> +
> +		if (!entry->ptr_inner) {
> +			if (alloc)
> +				entry->ptr_inner = kzalloc(SIDTAB_NODE_ALLOC_SIZE,
> +							   GFP_ATOMIC);
> +			if (!entry->ptr_inner)
> +				return NULL;
> +		}
> +	}
> +	if (!entry->ptr_leaf) {
> +		if (alloc)
> +			entry->ptr_leaf = kzalloc(SIDTAB_NODE_ALLOC_SIZE,
> +						  GFP_ATOMIC);
> +		if (!entry->ptr_leaf)
> +			return NULL;
> +	}
> +	return &entry->ptr_leaf->entries[index % SIDTAB_LEAF_ENTRIES].context;
> +}
> +
> +static struct context *sidtab_lookup(struct sidtab *s, u32 index)
> +{
> +	u32 count = (u32)atomic_read(&s->count);
> +
> +	if (index >= count)
> +		return NULL;
> +
> +	/* read entries after reading count */
> +	smp_rmb();
> +
> +	return sidtab_do_lookup(s, index, 0);
>   }
>   
>   static struct context *sidtab_search_core(struct sidtab *s, u32 sid, int force)
> @@ -123,7 +143,7 @@ static struct context *sidtab_search_core(struct sidtab *s, u32 sid, int force)
>   	struct context *context;
>   	struct sidtab_isid_entry *entry;
>   
> -	if (!s || sid == 0)
> +	if (sid == 0)
>   		return NULL;
>   
>   	if (sid > SECINITSID_NUM) {
> @@ -149,140 +169,126 @@ struct context *sidtab_search_force(struct sidtab *s, u32 sid)
>   	return sidtab_search_core(s, sid, 1);
>   }
>   
> -static int sidtab_map(struct sidtab *s,
> -		      int (*apply)(u32 sid,
> -				   struct context *context,
> -				   void *args),
> -		      void *args)
> +static int sidtab_find_context(union sidtab_entry_inner entry,
> +			       u32 *pos, u32 count, u32 level,
> +			       struct context *context, u32 *index)
>   {
> -	int i, rc = 0;
> -	struct sidtab_node *cur;
> +	int rc;
> +	u32 i;
>   
> -	if (!s)
> -		goto out;
> +	if (level != 0) {
> +		struct sidtab_node_inner *node = entry.ptr_inner;
>   
> -	for (i = 0; i < SIDTAB_SIZE; i++) {
> -		cur = s->htable[i];
> -		while (cur) {
> -			rc = apply(cur->sid, &cur->context, args);
> -			if (rc)
> -				goto out;
> -			cur = cur->next;
> +		i = 0;
> +		while (i < SIDTAB_INNER_ENTRIES && *pos < count) {
> +			rc = sidtab_find_context(node->entries[i],
> +						 pos, count, level - 1,
> +						 context, index);
> +			if (rc == 0)
> +				return 0;
> +			i++;
>   		}
> -	}
> -out:
> -	return rc;
> -}
> +	} else {
> +		struct sidtab_node_leaf *node = entry.ptr_leaf;
>   
> -/* Clone the SID into the new SID table. */
> -static int clone_sid(u32 sid, struct context *context, void *arg)
> -{
> -	struct sidtab *s = arg;
> -	return sidtab_insert(s, sid, context);
> +		i = 0;
> +		while (i < SIDTAB_LEAF_ENTRIES && *pos < count) {
> +			if (context_cmp(&node->entries[i].context, context)) {
> +				*index = *pos;
> +				return 0;
> +			}
> +			(*pos)++;
> +			i++;
> +		}
> +	}
> +	return -ENOENT;
>   }
>   
> -int sidtab_convert(struct sidtab *s, struct sidtab *news,
> -		   int (*convert)(u32 sid,
> -				  struct context *context,
> -				  void *args),
> -		   void *args)
> +static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
> +				 u32 *index)
>   {
>   	unsigned long flags;
> +	u32 count = (u32)atomic_read(&s->count);
> +	u32 count_locked, level, pos;
> +	struct sidtab_convert_params *convert;
> +	struct context *dst, *dst_convert;
>   	int rc;
>   
> -	spin_lock_irqsave(&s->lock, flags);
> -	s->shutdown = 1;
> -	spin_unlock_irqrestore(&s->lock, flags);
> +	level = sidtab_level_from_count(count);
>   
> -	rc = sidtab_map(s, clone_sid, news);
> -	if (rc)
> -		return rc;
> +	/* read entries after reading count */
> +	smp_rmb();
>   
> -	return sidtab_map(news, convert, args);
> -}
> +	pos = 0;
> +	rc = sidtab_find_context(s->roots[level], &pos, count, level,
> +				 context, index);
> +	if (rc == 0)
> +		return 0;
>   
> -static void sidtab_update_cache(struct sidtab *s, struct sidtab_node *n, int loc)
> -{
> -	BUG_ON(loc >= SIDTAB_CACHE_LEN);
> +	/* lock-free search failed: lock, re-search, and insert if not found */
> +	spin_lock_irqsave(&s->lock, flags);
>   
> -	while (loc > 0) {
> -		s->cache[loc] = s->cache[loc - 1];
> -		loc--;
> -	}
> -	s->cache[0] = n;
> -}
> +	convert = s->convert;
> +	count_locked = (u32)atomic_read(&s->count);
> +	level = sidtab_level_from_count(count_locked);
>   
> -static inline int sidtab_search_context(struct sidtab *s,
> -					struct context *context, u32 *sid)
> -{
> -	int i;
> -	struct sidtab_node *cur;
> -
> -	for (i = 0; i < SIDTAB_SIZE; i++) {
> -		cur = s->htable[i];
> -		while (cur) {
> -			if (context_cmp(&cur->context, context)) {
> -				sidtab_update_cache(s, cur, SIDTAB_CACHE_LEN - 1);
> -				*sid = cur->sid;
> -				return 0;
> -			}
> -			cur = cur->next;
> +	/* if count has changed before we acquired the lock, then catch up */
> +	while (count < count_locked) {
> +		if (context_cmp(sidtab_do_lookup(s, count, 0), context)) {
> +			rc = 0;
> +			*index = count;
> +			goto out_unlock;
>   		}
> +		++count;
>   	}
> -	return -ENOENT;
> -}
>   
> -static inline int sidtab_search_cache(struct sidtab *s, struct context *context,
> -				      u32 *sid)
> -{
> -	int i;
> -	struct sidtab_node *node;
> -
> -	for (i = 0; i < SIDTAB_CACHE_LEN; i++) {
> -		node = s->cache[i];
> -		if (unlikely(!node))
> -			return -ENOENT;
> -		if (context_cmp(&node->context, context)) {
> -			sidtab_update_cache(s, node, i);
> -			*sid = node->sid;
> -			return 0;
> -		}
> -	}
> -	return -ENOENT;
> -}
> +	/* insert context into new entry */
> +	rc = -ENOMEM;
> +	dst = sidtab_do_lookup(s, count, 1);
> +	if (!dst)
> +		goto out_unlock;
>   
> -static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
> -				 u32 *sid)
> -{
> -	int ret;
> -	unsigned long flags;
> +	rc = context_cpy(dst, context);
> +	if (rc)
> +		goto out_unlock;
> +
> +	/*
> +	 * if we are building a new sidtab, we need to convert the context
> +	 * and insert it there as well
> +	 */
> +	if (convert) {
> +		rc = -ENOMEM;
> +		dst_convert = sidtab_do_lookup(convert->target, count, 1);
> +		if (!dst_convert) {
> +			context_destroy(dst);
> +			goto out_unlock;
> +		}
>   
> -	ret = sidtab_search_cache(s, context, sid);
> -	if (ret)
> -		ret = sidtab_search_context(s, context, sid);
> -	if (ret) {
> -		spin_lock_irqsave(&s->lock, flags);
> -		/* Rescan now that we hold the lock. */
> -		ret = sidtab_search_context(s, context, sid);
> -		if (!ret)
> -			goto unlock_out;
> -		/* No SID exists for the context.  Allocate a new one. */
> -		if (s->next_sid == (UINT_MAX - SECINITSID_NUM - 1) || s->shutdown) {
> -			ret = -ENOMEM;
> -			goto unlock_out;
> +		rc = convert->func(context, dst_convert, convert->args);
> +		if (rc) {
> +			context_destroy(dst);
> +			goto out_unlock;
>   		}
> -		*sid = s->next_sid++;
> -		if (context->len)
> -			pr_info("SELinux:  Context %s is not valid (left unmapped).\n",
> -			       context->str);
> -		ret = sidtab_insert(s, *sid, context);
> -		if (ret)
> -			s->next_sid--;
> -unlock_out:
> -		spin_unlock_irqrestore(&s->lock, flags);
> +
> +		/* at this point we know the insert won't fail */
> +		atomic_set(&convert->target->count, count + 1);
>   	}
>   
> -	return ret;
> +	if (context->len)
> +		pr_info("SELinux:  Context %s is not valid (left unmapped).\n",
> +			context->str);
> +
> +	*index = count;
> +
> +	/* write entries before writing new count */
> +	smp_wmb();
> +
> +	atomic_set(&s->count, count + 1);
> +
> +	rc = 0;
> +out_unlock:
> +	spin_unlock_irqrestore(&s->lock, flags);
> +	return rc;
>   }
>   
>   int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid)
> @@ -306,58 +312,134 @@ int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid)
>   	return 0;
>   }
>   
> -void sidtab_hash_eval(struct sidtab *h, char *tag)
> +static int sidtab_convert_tree(union sidtab_entry_inner *edst,
> +			       union sidtab_entry_inner *esrc,
> +			       u32 *pos, u32 count, u32 level,
> +			       struct sidtab_convert_params *convert)
>   {
> -	int i, chain_len, slots_used, max_chain_len;
> -	struct sidtab_node *cur;
> -
> -	slots_used = 0;
> -	max_chain_len = 0;
> -	for (i = 0; i < SIDTAB_SIZE; i++) {
> -		cur = h->htable[i];
> -		if (cur) {
> -			slots_used++;
> -			chain_len = 0;
> -			while (cur) {
> -				chain_len++;
> -				cur = cur->next;
> -			}
> +	int rc;
> +	u32 i;
>   
> -			if (chain_len > max_chain_len)
> -				max_chain_len = chain_len;
> +	if (level != 0) {
> +		if (!edst->ptr_inner) {
> +			edst->ptr_inner = kzalloc(SIDTAB_NODE_ALLOC_SIZE, GFP_KERNEL);
> +			if (!edst->ptr_inner)
> +				return -ENOMEM;
> +		}
> +		i = 0;
> +		while (i < SIDTAB_INNER_ENTRIES && *pos < count) {
> +			rc = sidtab_convert_tree(&edst->ptr_inner->entries[i],
> +						 &esrc->ptr_inner->entries[i],
> +						 pos, count, level - 1, convert);
> +			if (rc)
> +				return rc;
> +			i++;
>   		}
> +	} else {
> +		if (!edst->ptr_leaf) {
> +			edst->ptr_leaf = kzalloc(SIDTAB_NODE_ALLOC_SIZE, GFP_KERNEL);
> +			if (!edst->ptr_leaf)
> +				return -ENOMEM;
> +		}
> +		i = 0;
> +		while (i < SIDTAB_LEAF_ENTRIES && *pos < count) {
> +			rc = convert->func(&esrc->ptr_leaf->entries[i].context,
> +					   &edst->ptr_leaf->entries[i].context,
> +					   convert->args);
> +			if (rc)
> +				return rc;
> +			(*pos)++;
> +			i++;
> +		}
> +		cond_resched();
> +	}
> +	return 0;
> +}
> +
> +int sidtab_convert(struct sidtab *s, struct sidtab_convert_params *params)
> +{
> +	unsigned long flags;
> +	u32 count, level, pos;
> +	int rc;
> +
> +	spin_lock_irqsave(&s->lock, flags);
> +
> +	/* concurrent policy loads are not allowed */
> +	if (s->convert) {
> +		spin_unlock_irqrestore(&s->lock, flags);
> +		return -EBUSY;
> +	}
> +
> +	count = (u32)atomic_read(&s->count);
> +	level = sidtab_level_from_count(count);
> +
> +	/* allocate last leaf in the new sidtab (to avoid race with live convert) */
> +	rc = sidtab_do_lookup(params->target, count - 1, 1) ? 0 : -ENOMEM;
> +	if (rc) {
> +		spin_unlock_irqrestore(&s->lock, flags);
> +		return rc;
> +	}
> +
> +	/* set count in case no new entries are added during conversion */
> +	atomic_set(&params->target->count, count);
> +
> +	/* enable live convert of new entries */
> +	s->convert = params;
> +
> +	/* we can safely do the rest of the conversion outside the lock */
> +	spin_unlock_irqrestore(&s->lock, flags);
> +
> +	pr_info("SELinux:  Converting %u SID table entries...\n", count);
> +
> +	/* convert all entries not covered by live convert */
> +	pos = 0;
> +	rc = sidtab_convert_tree(&params->target->roots[level], &s->roots[level],
> +				 &pos, count, level, params);
> +	if (rc) {
> +		/* we need to keep the old table - disable live convert */
> +		spin_lock_irqsave(&s->lock, flags);
> +		s->convert = NULL;
> +		spin_unlock_irqrestore(&s->lock, flags);
>   	}
> +	return rc;
> +}
> +
> +static void sidtab_destroy_tree(union sidtab_entry_inner entry, u32 level)
> +{
> +	u32 i;
> +
> +	if (level != 0) {
> +		struct sidtab_node_inner *node = entry.ptr_inner;
>   
> -	pr_debug("%s:  %d entries and %d/%d buckets used, longest "
> -	       "chain length %d\n", tag, h->nel, slots_used, SIDTAB_SIZE,
> -	       max_chain_len);
> +		if (!node)
> +			return;
> +
> +		for (i = 0; i < SIDTAB_INNER_ENTRIES; i++)
> +			sidtab_destroy_tree(node->entries[i], level - 1);
> +		kfree(node);
> +	} else {
> +		struct sidtab_node_leaf *node = entry.ptr_leaf;
> +
> +		if (!node)
> +			return;
> +
> +		for (i = 0; i < SIDTAB_LEAF_ENTRIES; i++)
> +			context_destroy(&node->entries[i].context);
> +		kfree(node);
> +	}
>   }
>   
>   void sidtab_destroy(struct sidtab *s)
>   {
> -	int i;
> -	struct sidtab_node *cur, *temp;
> -
> -	if (!s)
> -		return;
> +	u32 i, level;
>   
>   	for (i = 0; i < SECINITSID_NUM; i++)
>   		if (s->isids[i].set)
>   			context_destroy(&s->isids[i].context);
>   
> +	level = SIDTAB_MAX_LEVEL;
> +	while (level && !s->roots[level].ptr_inner)
> +		--level;
>   
> -	for (i = 0; i < SIDTAB_SIZE; i++) {
> -		cur = s->htable[i];
> -		while (cur) {
> -			temp = cur;
> -			cur = cur->next;
> -			context_destroy(&temp->context);
> -			kfree(temp);
> -		}
> -		s->htable[i] = NULL;
> -	}
> -	kfree(s->htable);
> -	s->htable = NULL;
> -	s->nel = 0;
> -	s->next_sid = 1;
> +	sidtab_destroy_tree(s->roots[level], level);
>   }
> diff --git a/security/selinux/ss/sidtab.h b/security/selinux/ss/sidtab.h
> index e657ae6bf996..292512792a70 100644
> --- a/security/selinux/ss/sidtab.h
> +++ b/security/selinux/ss/sidtab.h
> @@ -1,39 +1,75 @@
>   /* SPDX-License-Identifier: GPL-2.0 */
>   /*
> - * A security identifier table (sidtab) is a hash table
> + * A security identifier table (sidtab) is a lookup table
>    * of security context structures indexed by SID value.
>    *
> - * Author : Stephen Smalley, <sds@tycho.nsa.gov>
> + * Original author: Stephen Smalley, <sds@tycho.nsa.gov>
> + * Author: Ondrej Mosnacek, <omosnacek@gmail.com>
> + *
> + * Copyright (C) 2018 Red Hat, Inc.
>    */
>   #ifndef _SS_SIDTAB_H_
>   #define _SS_SIDTAB_H_
>   
> +#include <linux/spinlock_types.h>
> +#include <linux/log2.h>
> +
>   #include "context.h"
>   
> -struct sidtab_node {
> -	u32 sid;		/* security identifier */
> -	struct context context;	/* security context structure */
> -	struct sidtab_node *next;
> +struct sidtab_entry_leaf {
> +	struct context context;
> +};
> +
> +struct sidtab_node_inner;
> +struct sidtab_node_leaf;
> +
> +union sidtab_entry_inner {
> +	struct sidtab_node_inner *ptr_inner;
> +	struct sidtab_node_leaf  *ptr_leaf;
>   };
>   
> -#define SIDTAB_HASH_BITS 7
> -#define SIDTAB_HASH_BUCKETS (1 << SIDTAB_HASH_BITS)
> -#define SIDTAB_HASH_MASK (SIDTAB_HASH_BUCKETS-1)
> +/* align node size to page boundary */
> +#define SIDTAB_NODE_ALLOC_SHIFT PAGE_SHIFT
> +#define SIDTAB_NODE_ALLOC_SIZE  PAGE_SIZE
> +
> +#define size_to_shift(size) ((size) == 1 ? 1 : (const_ilog2((size) - 1) + 1))
> +
> +#define SIDTAB_INNER_SHIFT \
> +	(SIDTAB_NODE_ALLOC_SHIFT - size_to_shift(sizeof(union sidtab_entry_inner)))
>   
> -#define SIDTAB_SIZE SIDTAB_HASH_BUCKETS
> +#define SIDTAB_LEAF_ENTRIES  (PAGE_SIZE / sizeof(struct sidtab_entry_leaf))
> +#define SIDTAB_INNER_ENTRIES ((size_t)1 << SIDTAB_INNER_SHIFT)
> +
> +#define SIDTAB_MAX_BITS 31 /* limited to INT_MAX due to atomic_t range */
> +#define SIDTAB_MAX (((u32)1 << SIDTAB_MAX_BITS) - 1)
> +/* ensure enough tree levels for SIDTAB_MAX entries */
> +#define SIDTAB_MAX_LEVEL \
> +	DIV_ROUND_UP(SIDTAB_MAX_BITS - size_to_shift(SIDTAB_LEAF_ENTRIES), \
> +		     SIDTAB_INNER_SHIFT)
> +
> +struct sidtab_node_leaf {
> +	struct sidtab_entry_leaf entries[SIDTAB_LEAF_ENTRIES];
> +};
> +
> +struct sidtab_node_inner {
> +	union sidtab_entry_inner entries[SIDTAB_INNER_ENTRIES];
> +};
>   
>   struct sidtab_isid_entry {
>   	int set;
>   	struct context context;
>   };
>   
> +struct sidtab_convert_params {
> +	int (*func)(struct context *oldc, struct context *newc, void *args);
> +	void *args;
> +	struct sidtab *target;
> +};
> +
>   struct sidtab {
> -	struct sidtab_node **htable;
> -	unsigned int nel;	/* number of elements */
> -	unsigned int next_sid;	/* next SID to allocate */
> -	unsigned char shutdown;
> -#define SIDTAB_CACHE_LEN	3
> -	struct sidtab_node *cache[SIDTAB_CACHE_LEN];
> +	union sidtab_entry_inner roots[SIDTAB_MAX_LEVEL + 1];
> +	atomic_t count;
> +	struct sidtab_convert_params *convert;
>   	spinlock_t lock;
>   
>   	/* index == SID - 1 (no entry for SECSID_NULL) */
> @@ -45,15 +81,10 @@ int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context);
>   struct context *sidtab_search(struct sidtab *s, u32 sid);
>   struct context *sidtab_search_force(struct sidtab *s, u32 sid);
>   
> -int sidtab_convert(struct sidtab *s, struct sidtab *news,
> -		   int (*apply)(u32 sid,
> -				struct context *context,
> -				void *args),
> -		   void *args);
> +int sidtab_convert(struct sidtab *s, struct sidtab_convert_params *params);
>   
>   int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid);
>   
> -void sidtab_hash_eval(struct sidtab *h, char *tag);
>   void sidtab_destroy(struct sidtab *s);
>   
>   #endif	/* _SS_SIDTAB_H_ */
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v2 2/4] [squash] do not store entry for SECSID_NULL
  2018-11-27 17:00   ` Stephen Smalley
  2018-11-27 17:14     ` Stephen Smalley
@ 2018-11-27 19:45     ` Ondrej Mosnacek
  2018-11-28 12:07       ` Ondrej Mosnacek
  1 sibling, 1 reply; 11+ messages in thread
From: Ondrej Mosnacek @ 2018-11-27 19:45 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: selinux, Paul Moore

On Tue, Nov 27, 2018 at 5:58 PM Stephen Smalley <sds@tycho.nsa.gov> wrote:
> On 11/27/18 5:36 AM, Ondrej Mosnacek wrote:
> > This patch is kept separate only for review. Eventually it will be
> > folded into the previous patch.
>
> This one triggers a lot of warnings (security_compute_av: unrecognized
> SID 0, security_sid_to_context_core: unrecognized SID 0) and some
> failures during selinux-testsuite inet_socket tests.  While the policy
> doesn't provide an entry for SECSID_NULL, the sidtab search logic was
> remapping it to the unlabeled context and that was apparently being
> relied upon by the labeled networking code IIUC.

You're right, I made a mistake in the sidtab_search_core() function -
it shouldn't just return NULL when sid == 0, but instead skip to the
default-to-unlabeled fallback. This will be easy to fix.

Thanks for testing!

I wonder why I didn't get any inet_socket failures when running the
testsuite myself... I will have to look at it closer tomorrow.

>
>
> >
> > Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> > ---
> >   security/selinux/ss/policydb.c |  2 +-
> >   security/selinux/ss/sidtab.c   | 25 ++++++++++++++++---------
> >   security/selinux/ss/sidtab.h   |  3 ++-
> >   3 files changed, 19 insertions(+), 11 deletions(-)
> >
> > diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
> > index 59359fa0bd74..a50d625e7946 100644
> > --- a/security/selinux/ss/policydb.c
> > +++ b/security/selinux/ss/policydb.c
> > @@ -912,7 +912,7 @@ int policydb_load_isids(struct policydb *p, struct sidtab *s)
> >                       sidtab_destroy(s);
> >                       goto out;
> >               }
> > -             if (c->sid[0] > SECINITSID_NUM) {
> > +             if (c->sid[0] == SECSID_NULL || c->sid[0] > SECINITSID_NUM) {
> >                       pr_err("SELinux:  Initial SID %s out of range.\n",
> >                               c->u.name);
> >                       sidtab_destroy(s);
> > diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
> > index fd8115b211a6..e157d8240cf1 100644
> > --- a/security/selinux/ss/sidtab.c
> > +++ b/security/selinux/ss/sidtab.c
> > @@ -23,7 +23,7 @@ int sidtab_init(struct sidtab *s)
> >       if (!s->htable)
> >               return -ENOMEM;
> >
> > -     for (i = 0; i <= SECINITSID_NUM; i++)
> > +     for (i = 0; i < SECINITSID_NUM; i++)
> >               s->isids[i].set = 0;
> >
> >       for (i = 0; i < SIDTAB_SIZE; i++)
> > @@ -86,8 +86,15 @@ static int sidtab_insert(struct sidtab *s, u32 sid, struct context *context)
> >
> >   int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
> >   {
> > -     struct sidtab_isid_entry *entry = &s->isids[sid];
> > -     int rc = context_cpy(&entry->context, context);
> > +     struct sidtab_isid_entry *entry;
> > +     int rc;
> > +
> > +     if (sid == 0 || sid > SECINITSID_NUM)
> > +             return -EINVAL;
> > +
> > +     entry = &s->isids[sid - 1];
> > +
> > +     rc = context_cpy(&entry->context, context);
> >       if (rc)
> >               return rc;
> >
> > @@ -116,19 +123,19 @@ static struct context *sidtab_search_core(struct sidtab *s, u32 sid, int force)
> >       struct context *context;
> >       struct sidtab_isid_entry *entry;
> >
> > -     if (!s)
> > +     if (!s || sid == 0)
> >               return NULL;
> >
> >       if (sid > SECINITSID_NUM) {
> >               context = sidtab_lookup(s, sid - (SECINITSID_NUM + 1));
> >       } else {
> > -             entry = &s->isids[sid];
> > +             entry = &s->isids[sid - 1];
> >               context = entry->set ? &entry->context : NULL;
> >       }
> >       if (context && (!context->len || force))
> >               return context;
> >
> > -     entry = &s->isids[SECINITSID_UNLABELED];
> > +     entry = &s->isids[SECINITSID_UNLABELED - 1];
> >       return entry->set ? &entry->context : NULL;
> >   }
> >
> > @@ -283,11 +290,11 @@ int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid)
> >       int rc;
> >       u32 i;
> >
> > -     for (i = 0; i <= SECINITSID_NUM; i++) {
> > +     for (i = 0; i < SECINITSID_NUM; i++) {
> >               struct sidtab_isid_entry *entry = &s->isids[i];
> >
> >               if (entry->set && context_cmp(context, &entry->context)) {
> > -                     *sid = i;
> > +                     *sid = i + 1;
> >                       return 0;
> >               }
> >       }
> > @@ -334,7 +341,7 @@ void sidtab_destroy(struct sidtab *s)
> >       if (!s)
> >               return;
> >
> > -     for (i = 0; i <= SECINITSID_NUM; i++)
> > +     for (i = 0; i < SECINITSID_NUM; i++)
> >               if (s->isids[i].set)
> >                       context_destroy(&s->isids[i].context);
> >
> > diff --git a/security/selinux/ss/sidtab.h b/security/selinux/ss/sidtab.h
> > index dc0a80bc8894..e657ae6bf996 100644
> > --- a/security/selinux/ss/sidtab.h
> > +++ b/security/selinux/ss/sidtab.h
> > @@ -36,7 +36,8 @@ struct sidtab {
> >       struct sidtab_node *cache[SIDTAB_CACHE_LEN];
> >       spinlock_t lock;
> >
> > -     struct sidtab_isid_entry isids[SECINITSID_NUM + 1];
> > +     /* index == SID - 1 (no entry for SECSID_NULL) */
> > +     struct sidtab_isid_entry isids[SECINITSID_NUM];
> >   };
> >
> >   int sidtab_init(struct sidtab *s);
> >
>

--
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v2 3/4] selinux: overhaul sidtab to fix bug and improve performance
  2018-11-27 19:41   ` Stephen Smalley
@ 2018-11-27 20:03     ` Ondrej Mosnacek
  0 siblings, 0 replies; 11+ messages in thread
From: Ondrej Mosnacek @ 2018-11-27 20:03 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: selinux, Paul Moore

On Tue, Nov 27, 2018 at 8:38 PM Stephen Smalley <sds@tycho.nsa.gov> wrote:
> On 11/27/18 5:36 AM, Ondrej Mosnacek wrote:
> > Before this patch, during a policy reload the sidtab would become frozen
> > and trying to map a new context to SID would be unable to add a new
> > entry to sidtab and fail with -ENOMEM.
> >
> > Such failures are usually propagated into userspace, which has no way of
> > distignuishing them from actual allocation failures and thus doesn't
> > handle them gracefully. Such situation can be triggered e.g. by the
> > following reproducer:
> >
> >      while true; do load_policy; echo -n .; sleep 0.1; done &
> >      for (( i = 0; i < 1024; i++ )); do
> >          runcon -l s0:c$i echo -n x || break
> >          # or:
> >          # chcon -l s0:c$i <some_file> || break
> >      done
> >
> > This patch overhauls the sidtab so it doesn't need to be frozen during
> > policy reload, thus solving the above problem.
> >
> > The new SID table leverages the fact that SIDs are allocated
> > sequentially and are never invalidated and stores them in linear buckets
> > indexed by a tree structure. This brings several advantages:
> >    1. Fast SID -> context lookup - this lookup can now be done in
> >       logarithmic time complexity (usually in less than 4 array lookups)
> >       and can still be done safely without locking.
> >    2. No need to re-search the whole table on reverse lookup miss - after
> >       acquiring the spinlock only the newly added entries need to be
> >       searched, which means that reverse lookups that end up inserting a
> >       new entry are now about twice as fast.
> >    3. No need to freeze sidtab during policy reload - it is now possible
> >       to handle insertion of new entries even during sidtab conversion.
> >
> > The tree structure of the new sidtab is able to grow automatically to up
> > to about 2^31 entries (at which point it should not have more than about
> > 4 tree levels). The old sidtab had a theoretical capacity of almost 2^32
> > entries, but half of that is still more than enough since by that point
> > the reverse table lookups would become unusably slow anyway...
> >
> > The number of entries per tree node is selected automatically so that
> > each node fits into a single page, which should be the easiest size for
> > kmalloc() to handle.
> >
> > Note that the new implementation does not have a cache for recent
> > reverse lookups as the old one had. Such cache is, however, added in a
> > subsequent patch.
> >
> > Tested by selinux-testsuite and thoroughly tortured by this simple
> > stress test:
> > ```
> > function rand_cat() {
> >       echo $(( $RANDOM % 1024 ))
> > }
> >
> > function do_work() {
> >       while true; do
> >               echo -n "system_u:system_r:kernel_t:s0:c$(rand_cat),c$(rand_cat)" \
> >                       >/sys/fs/selinux/context 2>/dev/null || true
> >       done
> > }
> >
> > do_work >/dev/null &
> > do_work >/dev/null &
> > do_work >/dev/null &
> >
> > while load_policy; do echo -n .; sleep 0.1; done
> >
> > kill %1
> > kill %2
> > kill %3
> > ```
> >
> > Reported-by: Orion Poplawski <orion@nwra.com>
> > Reported-by: Li Kun <hw.likun@huawei.com>
> > Link: https://github.com/SELinuxProject/selinux-kernel/issues/38
> > Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> > ---
> >   security/selinux/ss/mls.c      |  23 +-
> >   security/selinux/ss/mls.h      |   3 +-
> >   security/selinux/ss/services.c |  90 +++---
> >   security/selinux/ss/sidtab.c   | 522 +++++++++++++++++++--------------
> >   security/selinux/ss/sidtab.h   |  75 +++--
> >   5 files changed, 402 insertions(+), 311 deletions(-)
> >
> > diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c
> > index 2fe459df3c85..267ae4f9be79 100644
> > --- a/security/selinux/ss/mls.c
> > +++ b/security/selinux/ss/mls.c
> > @@ -436,16 +436,17 @@ int mls_setup_user_range(struct policydb *p,
> >
> >   /*
> >    * Convert the MLS fields in the security context
> > - * structure `c' from the values specified in the
> > - * policy `oldp' to the values specified in the policy `newp'.
> > + * structure `oldc' from the values specified in the
> > + * policy `oldp' to the values specified in the policy `newp',
> > + * storing the resulting context in `newc'.
> >    */
> >   int mls_convert_context(struct policydb *oldp,
> >                       struct policydb *newp,
> > -                     struct context *c)
> > +                     struct context *oldc,
> > +                     struct context *newc)
> >   {
> >       struct level_datum *levdatum;
> >       struct cat_datum *catdatum;
> > -     struct ebitmap bitmap;
> >       struct ebitmap_node *node;
> >       int l, i;
> >
> > @@ -455,28 +456,24 @@ int mls_convert_context(struct policydb *oldp,
> >       for (l = 0; l < 2; l++) {
> >               levdatum = hashtab_search(newp->p_levels.table,
> >                                         sym_name(oldp, SYM_LEVELS,
> > -                                                c->range.level[l].sens - 1));
> > +                                                oldc->range.level[l].sens - 1));
> >
> >               if (!levdatum)
> >                       return -EINVAL;
> > -             c->range.level[l].sens = levdatum->level->sens;
> > +             newc->range.level[l].sens = levdatum->level->sens;
> >
> > -             ebitmap_init(&bitmap);
> > -             ebitmap_for_each_positive_bit(&c->range.level[l].cat, node, i) {
> > +             ebitmap_for_each_positive_bit(&oldc->range.level[l].cat, node, i) {
> >                       int rc;
> >
> >                       catdatum = hashtab_search(newp->p_cats.table,
> >                                                 sym_name(oldp, SYM_CATS, i));
> >                       if (!catdatum)
> >                               return -EINVAL;
> > -                     rc = ebitmap_set_bit(&bitmap, catdatum->value - 1, 1);
> > +                     rc = ebitmap_set_bit(&newc->range.level[l].cat,
> > +                                          catdatum->value - 1, 1);
> >                       if (rc)
> >                               return rc;
> > -
> > -                     cond_resched();
> >               }
> > -             ebitmap_destroy(&c->range.level[l].cat);
> > -             c->range.level[l].cat = bitmap;
> >       }
> >
> >       return 0;
> > diff --git a/security/selinux/ss/mls.h b/security/selinux/ss/mls.h
> > index 67093647576d..7954b1e60b64 100644
> > --- a/security/selinux/ss/mls.h
> > +++ b/security/selinux/ss/mls.h
> > @@ -46,7 +46,8 @@ int mls_range_set(struct context *context, struct mls_range *range);
> >
> >   int mls_convert_context(struct policydb *oldp,
> >                       struct policydb *newp,
> > -                     struct context *context);
> > +                     struct context *oldc,
> > +                     struct context *newc);
> >
> >   int mls_compute_sid(struct policydb *p,
> >                   struct context *scontext,
> > diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
> > index 30170d4c567a..3c5887838e12 100644
> > --- a/security/selinux/ss/services.c
> > +++ b/security/selinux/ss/services.c
> > @@ -1907,19 +1907,16 @@ struct convert_context_args {
> >
> >   /*
> >    * Convert the values in the security context
> > - * structure `c' from the values specified
> > + * structure `oldc' from the values specified
> >    * in the policy `p->oldp' to the values specified
> > - * in the policy `p->newp'.  Verify that the
> > - * context is valid under the new policy.
> > + * in the policy `p->newp', storing the new context
> > + * in `newc'.  Verify that the context is valid
> > + * under the new policy.
> >    */
> > -static int convert_context(u32 key,
> > -                        struct context *c,
> > -                        void *p)
> > +static int convert_context(struct context *oldc, struct context *newc, void *p)
> >   {
> >       struct convert_context_args *args;
> > -     struct context oldc;
> >       struct ocontext *oc;
> > -     struct mls_range *range;
> >       struct role_datum *role;
> >       struct type_datum *typdatum;
> >       struct user_datum *usrdatum;
> > @@ -1929,23 +1926,18 @@ static int convert_context(u32 key,
> >
> >       args = p;
> >
> > -     if (c->str) {
> > -             struct context ctx;
> > -
> > +     if (oldc->str) {
> >               rc = -ENOMEM;
> > -             s = kstrdup(c->str, GFP_KERNEL);
> > +             s = kstrdup(oldc->str, GFP_KERNEL);
> >               if (!s)
> >                       goto out;
> >
> >               rc = string_to_context_struct(args->newp, NULL, s,
> > -                                           &ctx, SECSID_NULL);
> > +                                           newc, SECSID_NULL);
> >               kfree(s);
> >               if (!rc) {
> >                       pr_info("SELinux:  Context %s became valid (mapped).\n",
> > -                            c->str);
> > -                     /* Replace string with mapped representation. */
> > -                     kfree(c->str);
> > -                     memcpy(c, &ctx, sizeof(*c));
> > +                             oldc->str);
> >                       goto out;
> >               } else if (rc == -EINVAL) {
> >                       /* Retain string representation for later mapping. */
> I haven't looked closely at the new sidtab implementation yet, but I was
> encountering KASAN failures in testing and noticed that they seemed to
> be tied to invalid/unmapped contexts in the filesystem.
>
> Reproducer:
> $ su
> # setenforce 0
> # touch foo
> # setfattr -n security.selinux -v thisisnotavalidcontext foo
> # ls -Z foo
> # load_policy
> # ls -Z foo
>
> The issue in this case seems to be that you aren't actually retaining
> the string representation of the invalid/unmapped context across a
> reload after this patch because you never copy it to newc, e.g. replace
> the rc = 0; line with rc = context_cpy(newc, oldc); in the EINVAL case
> above.  Otherwise, you'll be left with an uninitialized newc.
> Previously we didn't have to do anything here because we were just
> leaving c (oldc) intact.

Indeed, thanks for spotting that! I will have a closer look tomorrow.

>
> > @@ -1954,51 +1946,42 @@ static int convert_context(u32 key,
> >               } else {
> >                       /* Other error condition, e.g. ENOMEM. */
> >                       pr_err("SELinux:   Unable to map context %s, rc = %d.\n",
> > -                            c->str, -rc);
> > +                            oldc->str, -rc);
> >                       goto out;
> >               }
> >       }
> >
> > -     rc = context_cpy(&oldc, c);
> > -     if (rc)
> > -             goto out;
> > +     context_init(newc);
> >
> >       /* Convert the user. */
> >       rc = -EINVAL;
> >       usrdatum = hashtab_search(args->newp->p_users.table,
> > -                               sym_name(args->oldp, SYM_USERS, c->user - 1));
> > +                               sym_name(args->oldp, SYM_USERS, oldc->user - 1));
> >       if (!usrdatum)
> >               goto bad;
> > -     c->user = usrdatum->value;
> > +     newc->user = usrdatum->value;
> >
> >       /* Convert the role. */
> >       rc = -EINVAL;
> >       role = hashtab_search(args->newp->p_roles.table,
> > -                           sym_name(args->oldp, SYM_ROLES, c->role - 1));
> > +                           sym_name(args->oldp, SYM_ROLES, oldc->role - 1));
> >       if (!role)
> >               goto bad;
> > -     c->role = role->value;
> > +     newc->role = role->value;
> >
> >       /* Convert the type. */
> >       rc = -EINVAL;
> >       typdatum = hashtab_search(args->newp->p_types.table,
> > -                               sym_name(args->oldp, SYM_TYPES, c->type - 1));
> > +                               sym_name(args->oldp, SYM_TYPES, oldc->type - 1));
> >       if (!typdatum)
> >               goto bad;
> > -     c->type = typdatum->value;
> > +     newc->type = typdatum->value;
> >
> >       /* Convert the MLS fields if dealing with MLS policies */
> >       if (args->oldp->mls_enabled && args->newp->mls_enabled) {
> > -             rc = mls_convert_context(args->oldp, args->newp, c);
> > +             rc = mls_convert_context(args->oldp, args->newp, oldc, newc);
> >               if (rc)
> >                       goto bad;
> > -     } else if (args->oldp->mls_enabled && !args->newp->mls_enabled) {
> > -             /*
> > -              * Switching between MLS and non-MLS policy:
> > -              * free any storage used by the MLS fields in the
> > -              * context for all existing entries in the sidtab.
> > -              */
> > -             mls_context_destroy(c);
> >       } else if (!args->oldp->mls_enabled && args->newp->mls_enabled) {
> >               /*
> >                * Switching between non-MLS and MLS policy:
> > @@ -2016,36 +1999,31 @@ static int convert_context(u32 key,
> >                               " the initial SIDs list\n");
> >                       goto bad;
> >               }
> > -             range = &oc->context[0].range;
> > -             rc = mls_range_set(c, range);
> > +             rc = mls_range_set(newc, &oc->context[0].range);
> >               if (rc)
> >                       goto bad;
> >       }
> >
> >       /* Check the validity of the new context. */
> > -     if (!policydb_context_isvalid(args->newp, c)) {
> > -             rc = convert_context_handle_invalid_context(args->state,
> > -                                                         &oldc);
> > +     if (!policydb_context_isvalid(args->newp, newc)) {
> > +             rc = convert_context_handle_invalid_context(args->state, oldc);
> >               if (rc)
> >                       goto bad;
> >       }
> >
> > -     context_destroy(&oldc);
> > -
> >       rc = 0;
> >   out:
> >       return rc;
> >   bad:
> >       /* Map old representation to string and save it. */
> > -     rc = context_struct_to_string(args->oldp, &oldc, &s, &len);
> > +     rc = context_struct_to_string(args->oldp, oldc, &s, &len);
> >       if (rc)
> >               return rc;
> > -     context_destroy(&oldc);
> > -     context_destroy(c);
> > -     c->str = s;
> > -     c->len = len;
> > +     context_destroy(newc);
> > +     newc->str = s;
> > +     newc->len = len;
> >       pr_info("SELinux:  Context %s became invalid (unmapped).\n",
> > -            c->str);
> > +             newc->str);
> >       rc = 0;
> >       goto out;
> >   }
> > @@ -2091,6 +2069,7 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
> >       struct policydb *oldpolicydb, *newpolicydb;
> >       struct selinux_mapping *oldmapping;
> >       struct selinux_map newmap;
> > +     struct sidtab_convert_params convert_params;
> >       struct convert_context_args args;
> >       u32 seqno;
> >       int rc = 0;
> > @@ -2147,12 +2126,6 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
> >               goto out;
> >       }
> >
> > -     oldsidtab = state->ss->sidtab;
> > -
> > -#if 0
> > -     sidtab_hash_eval(oldsidtab, "sids");
> > -#endif
> > -
> >       rc = policydb_read(newpolicydb, fp);
> >       if (rc) {
> >               kfree(newsidtab);
> > @@ -2184,6 +2157,8 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
> >               goto err;
> >       }
> >
> > +     oldsidtab = state->ss->sidtab;
> > +
> >       /*
> >        * Convert the internal representations of contexts
> >        * in the new SID table.
> > @@ -2191,7 +2166,12 @@ int security_load_policy(struct selinux_state *state, void *data, size_t len)
> >       args.state = state;
> >       args.oldp = policydb;
> >       args.newp = newpolicydb;
> > -     rc = sidtab_convert(oldsidtab, newsidtab, convert_context, &args);
> > +
> > +     convert_params.func = convert_context;
> > +     convert_params.args = &args;
> > +     convert_params.target = newsidtab;
> > +
> > +     rc = sidtab_convert(oldsidtab, &convert_params);
> >       if (rc) {
> >               pr_err("SELinux:  unable to convert the internal"
> >                       " representation of contexts in the new SID"
> > diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
> > index e157d8240cf1..a82deecfac09 100644
> > --- a/security/selinux/ss/sidtab.c
> > +++ b/security/selinux/ss/sidtab.c
> > @@ -2,88 +2,38 @@
> >   /*
> >    * Implementation of the SID table type.
> >    *
> > - * Author : Stephen Smalley, <sds@tycho.nsa.gov>
> > + * Original author: Stephen Smalley, <sds@tycho.nsa.gov>
> > + * Author: Ondrej Mosnacek, <omosnacek@gmail.com>
> > + *
> > + * Copyright (C) 2018 Red Hat, Inc.
> >    */
> > +#include <linux/errno.h>
> >   #include <linux/kernel.h>
> >   #include <linux/slab.h>
> > +#include <linux/sched.h>
> >   #include <linux/spinlock.h>
> > -#include <linux/errno.h>
> > +#include <linux/atomic.h>
> >   #include "flask.h"
> >   #include "security.h"
> >   #include "sidtab.h"
> >
> > -#define SIDTAB_HASH(sid) \
> > -(sid & SIDTAB_HASH_MASK)
> > -
> >   int sidtab_init(struct sidtab *s)
> >   {
> > -     int i;
> > +     u32 i;
> >
> > -     s->htable = kmalloc_array(SIDTAB_SIZE, sizeof(*s->htable), GFP_ATOMIC);
> > -     if (!s->htable)
> > -             return -ENOMEM;
> > +     memset(s->roots, 0, sizeof(s->roots));
> >
> >       for (i = 0; i < SECINITSID_NUM; i++)
> >               s->isids[i].set = 0;
> >
> > -     for (i = 0; i < SIDTAB_SIZE; i++)
> > -             s->htable[i] = NULL;
> > +     atomic_set(&s->count, 0);
> >
> > -     for (i = 0; i < SIDTAB_CACHE_LEN; i++)
> > -             s->cache[i] = NULL;
> > +     s->convert = NULL;
> >
> > -     s->nel = 0;
> > -     s->next_sid = 0;
> > -     s->shutdown = 0;
> >       spin_lock_init(&s->lock);
> >       return 0;
> >   }
> >
> > -static int sidtab_insert(struct sidtab *s, u32 sid, struct context *context)
> > -{
> > -     int hvalue;
> > -     struct sidtab_node *prev, *cur, *newnode;
> > -
> > -     if (!s)
> > -             return -ENOMEM;
> > -
> > -     hvalue = SIDTAB_HASH(sid);
> > -     prev = NULL;
> > -     cur = s->htable[hvalue];
> > -     while (cur && sid > cur->sid) {
> > -             prev = cur;
> > -             cur = cur->next;
> > -     }
> > -
> > -     if (cur && sid == cur->sid)
> > -             return -EEXIST;
> > -
> > -     newnode = kmalloc(sizeof(*newnode), GFP_ATOMIC);
> > -     if (!newnode)
> > -             return -ENOMEM;
> > -
> > -     newnode->sid = sid;
> > -     if (context_cpy(&newnode->context, context)) {
> > -             kfree(newnode);
> > -             return -ENOMEM;
> > -     }
> > -
> > -     if (prev) {
> > -             newnode->next = prev->next;
> > -             wmb();
> > -             prev->next = newnode;
> > -     } else {
> > -             newnode->next = s->htable[hvalue];
> > -             wmb();
> > -             s->htable[hvalue] = newnode;
> > -     }
> > -
> > -     s->nel++;
> > -     if (sid >= s->next_sid)
> > -             s->next_sid = sid + 1;
> > -     return 0;
> > -}
> > -
> >   int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
> >   {
> >       struct sidtab_isid_entry *entry;
> > @@ -102,20 +52,90 @@ int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
> >       return 0;
> >   }
> >
> > -static struct context *sidtab_lookup(struct sidtab *s, u32 sid)
> > +static u32 sidtab_level_from_count(u32 count)
> >   {
> > -     int hvalue;
> > -     struct sidtab_node *cur;
> > +     u32 capacity = SIDTAB_LEAF_ENTRIES;
> > +     u32 level = 0;
> >
> > -     hvalue = SIDTAB_HASH(sid);
> > -     cur = s->htable[hvalue];
> > -     while (cur && sid > cur->sid)
> > -             cur = cur->next;
> > +     while (count > capacity) {
> > +             capacity <<= SIDTAB_INNER_SHIFT;
> > +             ++level;
> > +     }
> > +     return level;
> > +}
> > +
> > +static int sidtab_alloc_roots(struct sidtab *s, u32 level)
> > +{
> > +     u32 l;
> > +
> > +     if (!s->roots[0].ptr_leaf) {
> > +             s->roots[0].ptr_leaf = kzalloc(SIDTAB_NODE_ALLOC_SIZE,
> > +                                            GFP_ATOMIC);
> > +             if (!s->roots[0].ptr_leaf)
> > +                     return -ENOMEM;
> > +     }
> > +     for (l = 1; l <= level; ++l)
> > +             if (!s->roots[l].ptr_inner) {
> > +                     s->roots[l].ptr_inner = kzalloc(SIDTAB_NODE_ALLOC_SIZE,
> > +                                                     GFP_ATOMIC);
> > +                     if (!s->roots[l].ptr_inner)
> > +                             return -ENOMEM;
> > +                     s->roots[l].ptr_inner->entries[0] = s->roots[l - 1];
> > +             }
> > +     return 0;
> > +}
> >
> > -     if (!cur || sid != cur->sid)
> > +static struct context *sidtab_do_lookup(struct sidtab *s, u32 index, int alloc)
> > +{
> > +     union sidtab_entry_inner *entry;
> > +     u32 level, capacity_shift, leaf_index = index / SIDTAB_LEAF_ENTRIES;
> > +
> > +     /* find the level of the subtree we need */
> > +     level = sidtab_level_from_count(index + 1);
> > +     capacity_shift = level * SIDTAB_INNER_SHIFT;
> > +
> > +     /* allocate roots if needed */
> > +     if (alloc && sidtab_alloc_roots(s, level) != 0)
> >               return NULL;
> >
> > -     return &cur->context;
> > +     /* lookup inside the subtree */
> > +     entry = &s->roots[level];
> > +     while (level != 0) {
> > +             capacity_shift -= SIDTAB_INNER_SHIFT;
> > +             --level;
> > +
> > +             entry = &entry->ptr_inner->entries[leaf_index >> capacity_shift];
> > +             leaf_index &= ((u32)1 << capacity_shift) - 1;
> > +
> > +             if (!entry->ptr_inner) {
> > +                     if (alloc)
> > +                             entry->ptr_inner = kzalloc(SIDTAB_NODE_ALLOC_SIZE,
> > +                                                        GFP_ATOMIC);
> > +                     if (!entry->ptr_inner)
> > +                             return NULL;
> > +             }
> > +     }
> > +     if (!entry->ptr_leaf) {
> > +             if (alloc)
> > +                     entry->ptr_leaf = kzalloc(SIDTAB_NODE_ALLOC_SIZE,
> > +                                               GFP_ATOMIC);
> > +             if (!entry->ptr_leaf)
> > +                     return NULL;
> > +     }
> > +     return &entry->ptr_leaf->entries[index % SIDTAB_LEAF_ENTRIES].context;
> > +}
> > +
> > +static struct context *sidtab_lookup(struct sidtab *s, u32 index)
> > +{
> > +     u32 count = (u32)atomic_read(&s->count);
> > +
> > +     if (index >= count)
> > +             return NULL;
> > +
> > +     /* read entries after reading count */
> > +     smp_rmb();
> > +
> > +     return sidtab_do_lookup(s, index, 0);
> >   }
> >
> >   static struct context *sidtab_search_core(struct sidtab *s, u32 sid, int force)
> > @@ -123,7 +143,7 @@ static struct context *sidtab_search_core(struct sidtab *s, u32 sid, int force)
> >       struct context *context;
> >       struct sidtab_isid_entry *entry;
> >
> > -     if (!s || sid == 0)
> > +     if (sid == 0)
> >               return NULL;
> >
> >       if (sid > SECINITSID_NUM) {
> > @@ -149,140 +169,126 @@ struct context *sidtab_search_force(struct sidtab *s, u32 sid)
> >       return sidtab_search_core(s, sid, 1);
> >   }
> >
> > -static int sidtab_map(struct sidtab *s,
> > -                   int (*apply)(u32 sid,
> > -                                struct context *context,
> > -                                void *args),
> > -                   void *args)
> > +static int sidtab_find_context(union sidtab_entry_inner entry,
> > +                            u32 *pos, u32 count, u32 level,
> > +                            struct context *context, u32 *index)
> >   {
> > -     int i, rc = 0;
> > -     struct sidtab_node *cur;
> > +     int rc;
> > +     u32 i;
> >
> > -     if (!s)
> > -             goto out;
> > +     if (level != 0) {
> > +             struct sidtab_node_inner *node = entry.ptr_inner;
> >
> > -     for (i = 0; i < SIDTAB_SIZE; i++) {
> > -             cur = s->htable[i];
> > -             while (cur) {
> > -                     rc = apply(cur->sid, &cur->context, args);
> > -                     if (rc)
> > -                             goto out;
> > -                     cur = cur->next;
> > +             i = 0;
> > +             while (i < SIDTAB_INNER_ENTRIES && *pos < count) {
> > +                     rc = sidtab_find_context(node->entries[i],
> > +                                              pos, count, level - 1,
> > +                                              context, index);
> > +                     if (rc == 0)
> > +                             return 0;
> > +                     i++;
> >               }
> > -     }
> > -out:
> > -     return rc;
> > -}
> > +     } else {
> > +             struct sidtab_node_leaf *node = entry.ptr_leaf;
> >
> > -/* Clone the SID into the new SID table. */
> > -static int clone_sid(u32 sid, struct context *context, void *arg)
> > -{
> > -     struct sidtab *s = arg;
> > -     return sidtab_insert(s, sid, context);
> > +             i = 0;
> > +             while (i < SIDTAB_LEAF_ENTRIES && *pos < count) {
> > +                     if (context_cmp(&node->entries[i].context, context)) {
> > +                             *index = *pos;
> > +                             return 0;
> > +                     }
> > +                     (*pos)++;
> > +                     i++;
> > +             }
> > +     }
> > +     return -ENOENT;
> >   }
> >
> > -int sidtab_convert(struct sidtab *s, struct sidtab *news,
> > -                int (*convert)(u32 sid,
> > -                               struct context *context,
> > -                               void *args),
> > -                void *args)
> > +static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
> > +                              u32 *index)
> >   {
> >       unsigned long flags;
> > +     u32 count = (u32)atomic_read(&s->count);
> > +     u32 count_locked, level, pos;
> > +     struct sidtab_convert_params *convert;
> > +     struct context *dst, *dst_convert;
> >       int rc;
> >
> > -     spin_lock_irqsave(&s->lock, flags);
> > -     s->shutdown = 1;
> > -     spin_unlock_irqrestore(&s->lock, flags);
> > +     level = sidtab_level_from_count(count);
> >
> > -     rc = sidtab_map(s, clone_sid, news);
> > -     if (rc)
> > -             return rc;
> > +     /* read entries after reading count */
> > +     smp_rmb();
> >
> > -     return sidtab_map(news, convert, args);
> > -}
> > +     pos = 0;
> > +     rc = sidtab_find_context(s->roots[level], &pos, count, level,
> > +                              context, index);
> > +     if (rc == 0)
> > +             return 0;
> >
> > -static void sidtab_update_cache(struct sidtab *s, struct sidtab_node *n, int loc)
> > -{
> > -     BUG_ON(loc >= SIDTAB_CACHE_LEN);
> > +     /* lock-free search failed: lock, re-search, and insert if not found */
> > +     spin_lock_irqsave(&s->lock, flags);
> >
> > -     while (loc > 0) {
> > -             s->cache[loc] = s->cache[loc - 1];
> > -             loc--;
> > -     }
> > -     s->cache[0] = n;
> > -}
> > +     convert = s->convert;
> > +     count_locked = (u32)atomic_read(&s->count);
> > +     level = sidtab_level_from_count(count_locked);
> >
> > -static inline int sidtab_search_context(struct sidtab *s,
> > -                                     struct context *context, u32 *sid)
> > -{
> > -     int i;
> > -     struct sidtab_node *cur;
> > -
> > -     for (i = 0; i < SIDTAB_SIZE; i++) {
> > -             cur = s->htable[i];
> > -             while (cur) {
> > -                     if (context_cmp(&cur->context, context)) {
> > -                             sidtab_update_cache(s, cur, SIDTAB_CACHE_LEN - 1);
> > -                             *sid = cur->sid;
> > -                             return 0;
> > -                     }
> > -                     cur = cur->next;
> > +     /* if count has changed before we acquired the lock, then catch up */
> > +     while (count < count_locked) {
> > +             if (context_cmp(sidtab_do_lookup(s, count, 0), context)) {
> > +                     rc = 0;
> > +                     *index = count;
> > +                     goto out_unlock;
> >               }
> > +             ++count;
> >       }
> > -     return -ENOENT;
> > -}
> >
> > -static inline int sidtab_search_cache(struct sidtab *s, struct context *context,
> > -                                   u32 *sid)
> > -{
> > -     int i;
> > -     struct sidtab_node *node;
> > -
> > -     for (i = 0; i < SIDTAB_CACHE_LEN; i++) {
> > -             node = s->cache[i];
> > -             if (unlikely(!node))
> > -                     return -ENOENT;
> > -             if (context_cmp(&node->context, context)) {
> > -                     sidtab_update_cache(s, node, i);
> > -                     *sid = node->sid;
> > -                     return 0;
> > -             }
> > -     }
> > -     return -ENOENT;
> > -}
> > +     /* insert context into new entry */
> > +     rc = -ENOMEM;
> > +     dst = sidtab_do_lookup(s, count, 1);
> > +     if (!dst)
> > +             goto out_unlock;
> >
> > -static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
> > -                              u32 *sid)
> > -{
> > -     int ret;
> > -     unsigned long flags;
> > +     rc = context_cpy(dst, context);
> > +     if (rc)
> > +             goto out_unlock;
> > +
> > +     /*
> > +      * if we are building a new sidtab, we need to convert the context
> > +      * and insert it there as well
> > +      */
> > +     if (convert) {
> > +             rc = -ENOMEM;
> > +             dst_convert = sidtab_do_lookup(convert->target, count, 1);
> > +             if (!dst_convert) {
> > +                     context_destroy(dst);
> > +                     goto out_unlock;
> > +             }
> >
> > -     ret = sidtab_search_cache(s, context, sid);
> > -     if (ret)
> > -             ret = sidtab_search_context(s, context, sid);
> > -     if (ret) {
> > -             spin_lock_irqsave(&s->lock, flags);
> > -             /* Rescan now that we hold the lock. */
> > -             ret = sidtab_search_context(s, context, sid);
> > -             if (!ret)
> > -                     goto unlock_out;
> > -             /* No SID exists for the context.  Allocate a new one. */
> > -             if (s->next_sid == (UINT_MAX - SECINITSID_NUM - 1) || s->shutdown) {
> > -                     ret = -ENOMEM;
> > -                     goto unlock_out;
> > +             rc = convert->func(context, dst_convert, convert->args);
> > +             if (rc) {
> > +                     context_destroy(dst);
> > +                     goto out_unlock;
> >               }
> > -             *sid = s->next_sid++;
> > -             if (context->len)
> > -                     pr_info("SELinux:  Context %s is not valid (left unmapped).\n",
> > -                            context->str);
> > -             ret = sidtab_insert(s, *sid, context);
> > -             if (ret)
> > -                     s->next_sid--;
> > -unlock_out:
> > -             spin_unlock_irqrestore(&s->lock, flags);
> > +
> > +             /* at this point we know the insert won't fail */
> > +             atomic_set(&convert->target->count, count + 1);
> >       }
> >
> > -     return ret;
> > +     if (context->len)
> > +             pr_info("SELinux:  Context %s is not valid (left unmapped).\n",
> > +                     context->str);
> > +
> > +     *index = count;
> > +
> > +     /* write entries before writing new count */
> > +     smp_wmb();
> > +
> > +     atomic_set(&s->count, count + 1);
> > +
> > +     rc = 0;
> > +out_unlock:
> > +     spin_unlock_irqrestore(&s->lock, flags);
> > +     return rc;
> >   }
> >
> >   int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid)
> > @@ -306,58 +312,134 @@ int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid)
> >       return 0;
> >   }
> >
> > -void sidtab_hash_eval(struct sidtab *h, char *tag)
> > +static int sidtab_convert_tree(union sidtab_entry_inner *edst,
> > +                            union sidtab_entry_inner *esrc,
> > +                            u32 *pos, u32 count, u32 level,
> > +                            struct sidtab_convert_params *convert)
> >   {
> > -     int i, chain_len, slots_used, max_chain_len;
> > -     struct sidtab_node *cur;
> > -
> > -     slots_used = 0;
> > -     max_chain_len = 0;
> > -     for (i = 0; i < SIDTAB_SIZE; i++) {
> > -             cur = h->htable[i];
> > -             if (cur) {
> > -                     slots_used++;
> > -                     chain_len = 0;
> > -                     while (cur) {
> > -                             chain_len++;
> > -                             cur = cur->next;
> > -                     }
> > +     int rc;
> > +     u32 i;
> >
> > -                     if (chain_len > max_chain_len)
> > -                             max_chain_len = chain_len;
> > +     if (level != 0) {
> > +             if (!edst->ptr_inner) {
> > +                     edst->ptr_inner = kzalloc(SIDTAB_NODE_ALLOC_SIZE, GFP_KERNEL);
> > +                     if (!edst->ptr_inner)
> > +                             return -ENOMEM;
> > +             }
> > +             i = 0;
> > +             while (i < SIDTAB_INNER_ENTRIES && *pos < count) {
> > +                     rc = sidtab_convert_tree(&edst->ptr_inner->entries[i],
> > +                                              &esrc->ptr_inner->entries[i],
> > +                                              pos, count, level - 1, convert);
> > +                     if (rc)
> > +                             return rc;
> > +                     i++;
> >               }
> > +     } else {
> > +             if (!edst->ptr_leaf) {
> > +                     edst->ptr_leaf = kzalloc(SIDTAB_NODE_ALLOC_SIZE, GFP_KERNEL);
> > +                     if (!edst->ptr_leaf)
> > +                             return -ENOMEM;
> > +             }
> > +             i = 0;
> > +             while (i < SIDTAB_LEAF_ENTRIES && *pos < count) {
> > +                     rc = convert->func(&esrc->ptr_leaf->entries[i].context,
> > +                                        &edst->ptr_leaf->entries[i].context,
> > +                                        convert->args);
> > +                     if (rc)
> > +                             return rc;
> > +                     (*pos)++;
> > +                     i++;
> > +             }
> > +             cond_resched();
> > +     }
> > +     return 0;
> > +}
> > +
> > +int sidtab_convert(struct sidtab *s, struct sidtab_convert_params *params)
> > +{
> > +     unsigned long flags;
> > +     u32 count, level, pos;
> > +     int rc;
> > +
> > +     spin_lock_irqsave(&s->lock, flags);
> > +
> > +     /* concurrent policy loads are not allowed */
> > +     if (s->convert) {
> > +             spin_unlock_irqrestore(&s->lock, flags);
> > +             return -EBUSY;
> > +     }
> > +
> > +     count = (u32)atomic_read(&s->count);
> > +     level = sidtab_level_from_count(count);
> > +
> > +     /* allocate last leaf in the new sidtab (to avoid race with live convert) */
> > +     rc = sidtab_do_lookup(params->target, count - 1, 1) ? 0 : -ENOMEM;
> > +     if (rc) {
> > +             spin_unlock_irqrestore(&s->lock, flags);
> > +             return rc;
> > +     }
> > +
> > +     /* set count in case no new entries are added during conversion */
> > +     atomic_set(&params->target->count, count);
> > +
> > +     /* enable live convert of new entries */
> > +     s->convert = params;
> > +
> > +     /* we can safely do the rest of the conversion outside the lock */
> > +     spin_unlock_irqrestore(&s->lock, flags);
> > +
> > +     pr_info("SELinux:  Converting %u SID table entries...\n", count);
> > +
> > +     /* convert all entries not covered by live convert */
> > +     pos = 0;
> > +     rc = sidtab_convert_tree(&params->target->roots[level], &s->roots[level],
> > +                              &pos, count, level, params);
> > +     if (rc) {
> > +             /* we need to keep the old table - disable live convert */
> > +             spin_lock_irqsave(&s->lock, flags);
> > +             s->convert = NULL;
> > +             spin_unlock_irqrestore(&s->lock, flags);
> >       }
> > +     return rc;
> > +}
> > +
> > +static void sidtab_destroy_tree(union sidtab_entry_inner entry, u32 level)
> > +{
> > +     u32 i;
> > +
> > +     if (level != 0) {
> > +             struct sidtab_node_inner *node = entry.ptr_inner;
> >
> > -     pr_debug("%s:  %d entries and %d/%d buckets used, longest "
> > -            "chain length %d\n", tag, h->nel, slots_used, SIDTAB_SIZE,
> > -            max_chain_len);
> > +             if (!node)
> > +                     return;
> > +
> > +             for (i = 0; i < SIDTAB_INNER_ENTRIES; i++)
> > +                     sidtab_destroy_tree(node->entries[i], level - 1);
> > +             kfree(node);
> > +     } else {
> > +             struct sidtab_node_leaf *node = entry.ptr_leaf;
> > +
> > +             if (!node)
> > +                     return;
> > +
> > +             for (i = 0; i < SIDTAB_LEAF_ENTRIES; i++)
> > +                     context_destroy(&node->entries[i].context);
> > +             kfree(node);
> > +     }
> >   }
> >
> >   void sidtab_destroy(struct sidtab *s)
> >   {
> > -     int i;
> > -     struct sidtab_node *cur, *temp;
> > -
> > -     if (!s)
> > -             return;
> > +     u32 i, level;
> >
> >       for (i = 0; i < SECINITSID_NUM; i++)
> >               if (s->isids[i].set)
> >                       context_destroy(&s->isids[i].context);
> >
> > +     level = SIDTAB_MAX_LEVEL;
> > +     while (level && !s->roots[level].ptr_inner)
> > +             --level;
> >
> > -     for (i = 0; i < SIDTAB_SIZE; i++) {
> > -             cur = s->htable[i];
> > -             while (cur) {
> > -                     temp = cur;
> > -                     cur = cur->next;
> > -                     context_destroy(&temp->context);
> > -                     kfree(temp);
> > -             }
> > -             s->htable[i] = NULL;
> > -     }
> > -     kfree(s->htable);
> > -     s->htable = NULL;
> > -     s->nel = 0;
> > -     s->next_sid = 1;
> > +     sidtab_destroy_tree(s->roots[level], level);
> >   }
> > diff --git a/security/selinux/ss/sidtab.h b/security/selinux/ss/sidtab.h
> > index e657ae6bf996..292512792a70 100644
> > --- a/security/selinux/ss/sidtab.h
> > +++ b/security/selinux/ss/sidtab.h
> > @@ -1,39 +1,75 @@
> >   /* SPDX-License-Identifier: GPL-2.0 */
> >   /*
> > - * A security identifier table (sidtab) is a hash table
> > + * A security identifier table (sidtab) is a lookup table
> >    * of security context structures indexed by SID value.
> >    *
> > - * Author : Stephen Smalley, <sds@tycho.nsa.gov>
> > + * Original author: Stephen Smalley, <sds@tycho.nsa.gov>
> > + * Author: Ondrej Mosnacek, <omosnacek@gmail.com>
> > + *
> > + * Copyright (C) 2018 Red Hat, Inc.
> >    */
> >   #ifndef _SS_SIDTAB_H_
> >   #define _SS_SIDTAB_H_
> >
> > +#include <linux/spinlock_types.h>
> > +#include <linux/log2.h>
> > +
> >   #include "context.h"
> >
> > -struct sidtab_node {
> > -     u32 sid;                /* security identifier */
> > -     struct context context; /* security context structure */
> > -     struct sidtab_node *next;
> > +struct sidtab_entry_leaf {
> > +     struct context context;
> > +};
> > +
> > +struct sidtab_node_inner;
> > +struct sidtab_node_leaf;
> > +
> > +union sidtab_entry_inner {
> > +     struct sidtab_node_inner *ptr_inner;
> > +     struct sidtab_node_leaf  *ptr_leaf;
> >   };
> >
> > -#define SIDTAB_HASH_BITS 7
> > -#define SIDTAB_HASH_BUCKETS (1 << SIDTAB_HASH_BITS)
> > -#define SIDTAB_HASH_MASK (SIDTAB_HASH_BUCKETS-1)
> > +/* align node size to page boundary */
> > +#define SIDTAB_NODE_ALLOC_SHIFT PAGE_SHIFT
> > +#define SIDTAB_NODE_ALLOC_SIZE  PAGE_SIZE
> > +
> > +#define size_to_shift(size) ((size) == 1 ? 1 : (const_ilog2((size) - 1) + 1))
> > +
> > +#define SIDTAB_INNER_SHIFT \
> > +     (SIDTAB_NODE_ALLOC_SHIFT - size_to_shift(sizeof(union sidtab_entry_inner)))
> >
> > -#define SIDTAB_SIZE SIDTAB_HASH_BUCKETS
> > +#define SIDTAB_LEAF_ENTRIES  (PAGE_SIZE / sizeof(struct sidtab_entry_leaf))
> > +#define SIDTAB_INNER_ENTRIES ((size_t)1 << SIDTAB_INNER_SHIFT)
> > +
> > +#define SIDTAB_MAX_BITS 31 /* limited to INT_MAX due to atomic_t range */
> > +#define SIDTAB_MAX (((u32)1 << SIDTAB_MAX_BITS) - 1)
> > +/* ensure enough tree levels for SIDTAB_MAX entries */
> > +#define SIDTAB_MAX_LEVEL \
> > +     DIV_ROUND_UP(SIDTAB_MAX_BITS - size_to_shift(SIDTAB_LEAF_ENTRIES), \
> > +                  SIDTAB_INNER_SHIFT)
> > +
> > +struct sidtab_node_leaf {
> > +     struct sidtab_entry_leaf entries[SIDTAB_LEAF_ENTRIES];
> > +};
> > +
> > +struct sidtab_node_inner {
> > +     union sidtab_entry_inner entries[SIDTAB_INNER_ENTRIES];
> > +};
> >
> >   struct sidtab_isid_entry {
> >       int set;
> >       struct context context;
> >   };
> >
> > +struct sidtab_convert_params {
> > +     int (*func)(struct context *oldc, struct context *newc, void *args);
> > +     void *args;
> > +     struct sidtab *target;
> > +};
> > +
> >   struct sidtab {
> > -     struct sidtab_node **htable;
> > -     unsigned int nel;       /* number of elements */
> > -     unsigned int next_sid;  /* next SID to allocate */
> > -     unsigned char shutdown;
> > -#define SIDTAB_CACHE_LEN     3
> > -     struct sidtab_node *cache[SIDTAB_CACHE_LEN];
> > +     union sidtab_entry_inner roots[SIDTAB_MAX_LEVEL + 1];
> > +     atomic_t count;
> > +     struct sidtab_convert_params *convert;
> >       spinlock_t lock;
> >
> >       /* index == SID - 1 (no entry for SECSID_NULL) */
> > @@ -45,15 +81,10 @@ int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context);
> >   struct context *sidtab_search(struct sidtab *s, u32 sid);
> >   struct context *sidtab_search_force(struct sidtab *s, u32 sid);
> >
> > -int sidtab_convert(struct sidtab *s, struct sidtab *news,
> > -                int (*apply)(u32 sid,
> > -                             struct context *context,
> > -                             void *args),
> > -                void *args);
> > +int sidtab_convert(struct sidtab *s, struct sidtab_convert_params *params);
> >
> >   int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid);
> >
> > -void sidtab_hash_eval(struct sidtab *h, char *tag);
> >   void sidtab_destroy(struct sidtab *s);
> >
> >   #endif      /* _SS_SIDTAB_H_ */
> >
>


-- 
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v2 2/4] [squash] do not store entry for SECSID_NULL
  2018-11-27 19:45     ` Ondrej Mosnacek
@ 2018-11-28 12:07       ` Ondrej Mosnacek
  0 siblings, 0 replies; 11+ messages in thread
From: Ondrej Mosnacek @ 2018-11-28 12:07 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: selinux, Paul Moore

On Tue, Nov 27, 2018 at 8:45 PM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> On Tue, Nov 27, 2018 at 5:58 PM Stephen Smalley <sds@tycho.nsa.gov> wrote:
> > On 11/27/18 5:36 AM, Ondrej Mosnacek wrote:
> > > This patch is kept separate only for review. Eventually it will be
> > > folded into the previous patch.
> >
> > This one triggers a lot of warnings (security_compute_av: unrecognized
> > SID 0, security_sid_to_context_core: unrecognized SID 0) and some
> > failures during selinux-testsuite inet_socket tests.  While the policy
> > doesn't provide an entry for SECSID_NULL, the sidtab search logic was
> > remapping it to the unlabeled context and that was apparently being
> > relied upon by the labeled networking code IIUC.
>
> You're right, I made a mistake in the sidtab_search_core() function -
> it shouldn't just return NULL when sid == 0, but instead skip to the
> default-to-unlabeled fallback. This will be easy to fix.
>
> Thanks for testing!
>
> I wonder why I didn't get any inet_socket failures when running the
> testsuite myself... I will have to look at it closer tomorrow.

Hmm... I must have been accidentally testing a wrong kernel build. I
am now able to reproduce both the failures and the hang.

I am now building a new kernel with this and the convert_context issues fixed.

>
> >
> >
> > >
> > > Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> > > ---
> > >   security/selinux/ss/policydb.c |  2 +-
> > >   security/selinux/ss/sidtab.c   | 25 ++++++++++++++++---------
> > >   security/selinux/ss/sidtab.h   |  3 ++-
> > >   3 files changed, 19 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
> > > index 59359fa0bd74..a50d625e7946 100644
> > > --- a/security/selinux/ss/policydb.c
> > > +++ b/security/selinux/ss/policydb.c
> > > @@ -912,7 +912,7 @@ int policydb_load_isids(struct policydb *p, struct sidtab *s)
> > >                       sidtab_destroy(s);
> > >                       goto out;
> > >               }
> > > -             if (c->sid[0] > SECINITSID_NUM) {
> > > +             if (c->sid[0] == SECSID_NULL || c->sid[0] > SECINITSID_NUM) {
> > >                       pr_err("SELinux:  Initial SID %s out of range.\n",
> > >                               c->u.name);
> > >                       sidtab_destroy(s);
> > > diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
> > > index fd8115b211a6..e157d8240cf1 100644
> > > --- a/security/selinux/ss/sidtab.c
> > > +++ b/security/selinux/ss/sidtab.c
> > > @@ -23,7 +23,7 @@ int sidtab_init(struct sidtab *s)
> > >       if (!s->htable)
> > >               return -ENOMEM;
> > >
> > > -     for (i = 0; i <= SECINITSID_NUM; i++)
> > > +     for (i = 0; i < SECINITSID_NUM; i++)
> > >               s->isids[i].set = 0;
> > >
> > >       for (i = 0; i < SIDTAB_SIZE; i++)
> > > @@ -86,8 +86,15 @@ static int sidtab_insert(struct sidtab *s, u32 sid, struct context *context)
> > >
> > >   int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
> > >   {
> > > -     struct sidtab_isid_entry *entry = &s->isids[sid];
> > > -     int rc = context_cpy(&entry->context, context);
> > > +     struct sidtab_isid_entry *entry;
> > > +     int rc;
> > > +
> > > +     if (sid == 0 || sid > SECINITSID_NUM)
> > > +             return -EINVAL;
> > > +
> > > +     entry = &s->isids[sid - 1];
> > > +
> > > +     rc = context_cpy(&entry->context, context);
> > >       if (rc)
> > >               return rc;
> > >
> > > @@ -116,19 +123,19 @@ static struct context *sidtab_search_core(struct sidtab *s, u32 sid, int force)
> > >       struct context *context;
> > >       struct sidtab_isid_entry *entry;
> > >
> > > -     if (!s)
> > > +     if (!s || sid == 0)
> > >               return NULL;
> > >
> > >       if (sid > SECINITSID_NUM) {
> > >               context = sidtab_lookup(s, sid - (SECINITSID_NUM + 1));
> > >       } else {
> > > -             entry = &s->isids[sid];
> > > +             entry = &s->isids[sid - 1];
> > >               context = entry->set ? &entry->context : NULL;
> > >       }
> > >       if (context && (!context->len || force))
> > >               return context;
> > >
> > > -     entry = &s->isids[SECINITSID_UNLABELED];
> > > +     entry = &s->isids[SECINITSID_UNLABELED - 1];
> > >       return entry->set ? &entry->context : NULL;
> > >   }
> > >
> > > @@ -283,11 +290,11 @@ int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid)
> > >       int rc;
> > >       u32 i;
> > >
> > > -     for (i = 0; i <= SECINITSID_NUM; i++) {
> > > +     for (i = 0; i < SECINITSID_NUM; i++) {
> > >               struct sidtab_isid_entry *entry = &s->isids[i];
> > >
> > >               if (entry->set && context_cmp(context, &entry->context)) {
> > > -                     *sid = i;
> > > +                     *sid = i + 1;
> > >                       return 0;
> > >               }
> > >       }
> > > @@ -334,7 +341,7 @@ void sidtab_destroy(struct sidtab *s)
> > >       if (!s)
> > >               return;
> > >
> > > -     for (i = 0; i <= SECINITSID_NUM; i++)
> > > +     for (i = 0; i < SECINITSID_NUM; i++)
> > >               if (s->isids[i].set)
> > >                       context_destroy(&s->isids[i].context);
> > >
> > > diff --git a/security/selinux/ss/sidtab.h b/security/selinux/ss/sidtab.h
> > > index dc0a80bc8894..e657ae6bf996 100644
> > > --- a/security/selinux/ss/sidtab.h
> > > +++ b/security/selinux/ss/sidtab.h
> > > @@ -36,7 +36,8 @@ struct sidtab {
> > >       struct sidtab_node *cache[SIDTAB_CACHE_LEN];
> > >       spinlock_t lock;
> > >
> > > -     struct sidtab_isid_entry isids[SECINITSID_NUM + 1];
> > > +     /* index == SID - 1 (no entry for SECSID_NULL) */
> > > +     struct sidtab_isid_entry isids[SECINITSID_NUM];
> > >   };
> > >
> > >   int sidtab_init(struct sidtab *s);
> > >
> >
>
> --
> Ondrej Mosnacek <omosnace at redhat dot com>
> Associate Software Engineer, Security Technologies
> Red Hat, Inc.

--
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-11-28 12:07 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-27 10:36 [RFC PATCH v2 0/4] Fix ENOMEM errors during policy reload Ondrej Mosnacek
2018-11-27 10:36 ` [RFC PATCH v2 1/4] selinux: use separate table for initial SID lookup Ondrej Mosnacek
2018-11-27 10:36 ` [RFC PATCH v2 2/4] [squash] do not store entry for SECSID_NULL Ondrej Mosnacek
2018-11-27 17:00   ` Stephen Smalley
2018-11-27 17:14     ` Stephen Smalley
2018-11-27 19:45     ` Ondrej Mosnacek
2018-11-28 12:07       ` Ondrej Mosnacek
2018-11-27 10:36 ` [RFC PATCH v2 3/4] selinux: overhaul sidtab to fix bug and improve performance Ondrej Mosnacek
2018-11-27 19:41   ` Stephen Smalley
2018-11-27 20:03     ` Ondrej Mosnacek
2018-11-27 10:36 ` [RFC PATCH v2 4/4] [squash] add back reverse lookup cache to sidtab Ondrej Mosnacek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).