From: Ondrej Mosnacek <email@example.com> To: Will Deacon <firstname.lastname@example.org> Cc: Jeff Vander Stoep <email@example.com>, SElinux list <firstname.lastname@example.org>, Paul Moore <email@example.com>, Stephen Smalley <firstname.lastname@example.org>, Jovana Knezevic <email@example.com> Subject: Re: [PATCH v5] selinux: sidtab: reverse lookup hash table Date: Thu, 7 Nov 2019 19:41:31 +0100 Message-ID: <CAFqZXNuCugt05+4nejHYvkAGrW+WMmwsRQv8pjKEhKxdBMDVew@mail.gmail.com> (raw) In-Reply-To: <20191107164430.GA13483@willie-the-truck> On Thu, Nov 7, 2019 at 5:44 PM Will Deacon <firstname.lastname@example.org> wrote: > On Thu, Nov 07, 2019 at 05:12:53PM +0100, Ondrej Mosnacek wrote: > > On Thu, Nov 7, 2019 at 11:17 AM Jeff Vander Stoep <email@example.com> wrote: > > > This replaces the reverse table lookup and reverse cache with a > > > hashtable which improves cache-miss reverse-lookup times from > > > O(n) to O(1)* and maintains the same performance as a reverse > > > cache hit. > > > > > > This reduces the time needed to add a new sidtab entry from ~500us > > > to 5us on a Pixel 3 when there are ~10,000 sidtab entries. > > > > > > The implementation uses the kernel's generic hashtable API, > > > It uses the context's string represtation as the hash source, > > > and the kernels generic string hashing algorithm full_name_hash() > > > to reduce the string to a 32 bit value. > > > > > > This change also maintains the improvement introduced in > > > commit ee1a84fdfeed ("selinux: overhaul sidtab to fix bug and improve > > > performance") which removed the need to keep the current sidtab > > > locked during policy reload. It does however introduce periodic > > > locking of the target sidtab while converting the hashtable. Sidtab > > > entries are never modified or removed, so the context struct stored > > > in the sid_to_context tree can also be used for the context_to_sid > > > hashtable to reduce memory usage. > > > > > > This bug was reported by: > > > - On the selinux bug tracker. > > > BUG: kernel softlockup due to too many SIDs/contexts #37 > > > https://github.com/SELinuxProject/selinux-kernel/issues/37 > > > - Jovana Knezevic on Android's bugtracker. > > > Bug: 140252993 > > > "During multi-user performance testing, we create and remove users > > > many times. selinux_android_restorecon_pkgdir goes from 1ms to over > > > 20ms after about 200 user creations and removals. Accumulated over > > > ~280 packages, that adds a significant time to user creation, > > > making perf benchmarks unreliable." > > > > > > * Hashtable lookup is only O(1) when n < the number of buckets. > > > > > > Changes in V2: > > > -The hashtable uses sidtab_entry_leaf objects directly so these > > > objects are shared between the sid_to_context lookup tree and the > > > context_to_sid hashtable. This simplifies memory allocation and > > > was suggested by Ondrej Mosnacek. > > > -The new sidtab hash stats file in selinuxfs has been moved out of > > > the avc dir and into a new "ss" dir. > > > > > > V3: > > > -Add lock nesting notation. > > > > > > V4: > > > -Moved to *_rcu variants of the various hashtable functions > > > as suggested by Ondrej Mosnacek and Will Deacon. > > > > I may be misunderstanding the purpose of RCU, but isn't all this RCU > > stuff a big overkill when these entries are never removed? (Well, they > > are removed when the sidtab is being destroyed, at which point however > > no other threads are accessing them any more.) In my understanding, > > RCU basically makes sure that objects that you dereference in an RCU > > critical section are not destroyed until you leave it. Yes, it also > > helps you to dereference/update them safely, but that can be achieved > > on its own in a simpler way. Instead of using RCU here, I would > > instead suggest looking into adding an equivalent of > > list_for_each_entry_lockless()* for hlist and maybe introduce some > > suitable hlist_add_something() function that ensures consistency > > w.r.t. the lockless traversal (perhaps the WRITE_ONCE() in hlist_add() > > is already sufficient, but I'm not sure...). > > If you use the existing _rcu accessors you will get correctly enforced > dependency ordering on the reader side and correctly placed release > ordering on the updater side. I don't think that's a big overkill, and > you can use the RCU accessors to achieve the lockless traversal. OK, but you don't need the read_lock()/unlock(), rcu_head field in the entries, and kfree_rcu(), right? > > hlist_add() is not safe against a concurrent traversal because the > WRITE_ONCE() provides no memory ordering guarantees, allowing the readers > to see an uninitialised node. Right, so we would need a new function that does smp_store_release() instead. > > How exactly would list_for_each_entry_lockless() and hlist_add_something() > differ from the RCU variants, assuming they're implemented correctly? Looking at the implementation of rcu_dereference() and rcu_assign_pointer(), they would probably be practically the same, except the rcu_read_lock_held() check in rcu_dereference(). That and they would clearly communicate that they are not doing actual RCU, but just allow lockless traversal of an add-only hlist. -- Ondrej Mosnacek <omosnace at redhat dot com> Software Engineer, Security Technologies Red Hat, Inc.
next prev parent reply index Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-11-07 10:17 Jeff Vander Stoep 2019-11-07 12:01 ` Jeffrey Vander Stoep 2019-11-14 23:35 ` Paul Moore 2019-11-07 14:54 ` Stephen Smalley 2019-11-07 15:42 ` Ondrej Mosnacek 2019-11-07 15:59 ` Jeffrey Vander Stoep 2019-11-07 15:49 ` Jeffrey Vander Stoep 2019-11-07 16:12 ` Ondrej Mosnacek 2019-11-07 16:44 ` Will Deacon 2019-11-07 18:41 ` Ondrej Mosnacek [this message] 2019-11-07 20:06 ` Jeffrey Vander Stoep
Reply instructions: You may reply publically to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=CAFqZXNuCugt05+4nejHYvkAGrW+WMmwsRQv8pjKEhKxdBMDVew@mail.gmail.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
SELinux Archive on lore.kernel.org Archives are clonable: git clone --mirror https://lore.kernel.org/selinux/0 selinux/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 selinux selinux/ https://lore.kernel.org/selinux \ email@example.com public-inbox-index selinux Example config snippet for mirrors Newsgroup available over NNTP: nntp://nntp.lore.kernel.org/org.kernel.vger.selinux AGPL code for this site: git clone https://public-inbox.org/public-inbox.git