From: "Christian Göttsche" <cgzones@googlemail.com>
To: selinux@vger.kernel.org
Cc: Paul Moore <paul@paul-moore.com>,
	Stephen Smalley <stephen.smalley.work@gmail.com>,
	Ondrej Mosnacek <omosnace@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: [PATCH v2 2/2] selinux: improve symtab string hashing
Date: Fri, 15 Mar 2024 19:14:04 +0100
Message-ID: <20240315181414.649045-1-cgzones@googlemail.com>

The bucket index is calculated by performing a binary AND of the hash
value with the mask of the hash table, which is one less than its size
(which is a power of two).  This discards all top bits of the hash, so
short or similar inputs need a hash function with a good avalanche
effect to spread across the buckets.
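
To illustrate the masking problem, here is a minimal userspace sketch
(not kernel code).  rotate_xor_hash() mirrors the symhash() being
replaced and djb2a_hash() mirrors the replacement; bucket_index() is a
hypothetical helper that applies the table mask, and the keys "a_t" and
"b_t" are made-up examples.

```c
#include <stddef.h>
#include <string.h>

/* Mirrors the old symhash(): rotate left by 4 bits, then XOR in the byte. */
static unsigned int rotate_xor_hash(const char *key)
{
	unsigned int val = 0;
	size_t i, len = strlen(key);

	for (i = 0; i < len; i++)
		val = (val << 4 | (val >> (8 * sizeof(unsigned int) - 4))) ^
		      (unsigned char)key[i];
	return val;
}

/* Mirrors the new symhash(): djb2a, i.e. hash * 33 XOR byte, seeded with 5381. */
static unsigned int djb2a_hash(const char *key)
{
	unsigned int hash = 5381;
	unsigned char c;

	while ((c = (unsigned char)*key++))
		hash = ((hash << 5) + hash) ^ c;
	return hash;
}

/* Hypothetical helper: nbuckets is assumed to be a power of two. */
static unsigned int bucket_index(unsigned int hash, unsigned int nbuckets)
{
	return hash & (nbuckets - 1);
}
```

With 256 buckets, rotate_xor_hash() puts "a_t" and "b_t" in the same
bucket, because for such short keys the low eight bits depend only on
the last two characters.  djb2a_hash() puts them in different buckets.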

Use djb2a.  Hash table statistics before (current) and after (patch):

    # current
    common prefixes:  7 entries and 5/8 buckets used, longest chain length 2, sum of chain length^2 11
    classes:  134 entries and 100/256 buckets used, longest chain length 5, sum of chain length^2 234
    roles:  15 entries and 6/16 buckets used, longest chain length 5, sum of chain length^2 57
    types:  4448 entries and 3016/8192 buckets used, longest chain length 41, sum of chain length^2 14922
    users:  7 entries and 3/8 buckets used, longest chain length 3, sum of chain length^2 17
    bools:  306 entries and 221/512 buckets used, longest chain length 4, sum of chain length^2 524
    levels:  1 entries and 1/1 buckets used, longest chain length 1, sum of chain length^2 1
    categories:  1024 entries and 400/1024 buckets used, longest chain length 4, sum of chain length^2 2740

    # patch
    common prefixes:  7 entries and 5/8 buckets used, longest chain length 2, sum of chain length^2 11
    classes:  134 entries and 101/256 buckets used, longest chain length 3, sum of chain length^2 210
    roles:  15 entries and 9/16 buckets used, longest chain length 3, sum of chain length^2 31
    types:  4448 entries and 3459/8192 buckets used, longest chain length 5, sum of chain length^2 6778
    users:  7 entries and 5/8 buckets used, longest chain length 3, sum of chain length^2 13
    bools:  306 entries and 236/512 buckets used, longest chain length 5, sum of chain length^2 470
    levels:  1 entries and 1/1 buckets used, longest chain length 1, sum of chain length^2 1
    categories:  1024 entries and 518/1024 buckets used, longest chain length 7, sum of chain length^2 2992
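
For readers unfamiliar with the reported metrics, the following is a
minimal userspace sketch of how they could be derived from per-bucket
chain lengths.  struct chain_stats and chain_stats_from_lengths() are
hypothetical names for illustration, not the kernel's statistics code.

```c
#include <stddef.h>

/* Hypothetical container for the three metrics shown in the statistics. */
struct chain_stats {
	unsigned int slots_used;           /* buckets holding at least one entry */
	unsigned int max_chain_len;        /* longest chain length */
	unsigned long long chain2_len_sum; /* sum of chain length^2 */
};

/* Derive the metrics from an array holding the chain length of each bucket. */
static struct chain_stats chain_stats_from_lengths(const unsigned int *chain_len,
						   size_t nbuckets)
{
	struct chain_stats st = { 0, 0, 0 };
	size_t i;

	for (i = 0; i < nbuckets; i++) {
		unsigned int len = chain_len[i];

		if (!len)
			continue;
		st.slots_used++;
		if (len > st.max_chain_len)
			st.max_chain_len = len;
		st.chain2_len_sum += (unsigned long long)len * len;
	}
	return st;
}
```

For example, chain lengths of {2, 2, 1, 1, 1, 0, 0, 0} yield 5/8 buckets
used, a longest chain of 2, and a sum of squares of 11, which matches
the "common prefixes" line.  For a fixed number of entries, a lower sum
of squares indicates a more even distribution.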

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
---
v2:
   add licensing note
---
 security/selinux/ss/symtab.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/security/selinux/ss/symtab.c b/security/selinux/ss/symtab.c
index c04f8d447873..832660fd84a9 100644
--- a/security/selinux/ss/symtab.c
+++ b/security/selinux/ss/symtab.c
@@ -12,17 +12,17 @@
 
 static unsigned int symhash(const void *key)
 {
-	const char *p, *keyp;
-	unsigned int size;
-	unsigned int val;
-
-	val = 0;
-	keyp = key;
-	size = strlen(keyp);
-	for (p = keyp; (p - keyp) < size; p++)
-		val = (val << 4 | (val >> (8 * sizeof(unsigned int) - 4))) ^
-		      (*p);
-	return val;
+	/*
+	 * djb2a
+	 * Public domain from cdb v0.75
+	 */
+	unsigned int hash = 5381;
+	unsigned char c;
+
+	while ((c = *(const unsigned char *)key++))
+		hash = ((hash << 5) + hash) ^ c;
+
+	return hash;
 }
 
 static int symcmp(const void *key1, const void *key2)
-- 
2.43.0


Thread overview: 4+ messages
2024-03-15 18:14 Christian Göttsche [this message]
2024-03-15 18:14 ` [PATCH v2 1/2] selinux: dump statistics for more hash tables Christian Göttsche
2024-03-27 23:26   ` Paul Moore
2024-03-27 23:26 ` [PATCH v2 2/2] selinux: improve symtab string hashing Paul Moore
