selinux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Will Deacon <will@kernel.org>
To: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Jeff Vander Stoep <jeffv@google.com>,
	SElinux list <selinux@vger.kernel.org>,
	Paul Moore <paul@paul-moore.com>,
	Stephen Smalley <sds@tycho.nsa.gov>,
	Jovana Knezevic <jovanak@google.com>
Subject: Re: [PATCH v5] selinux: sidtab: reverse lookup hash table
Date: Thu, 7 Nov 2019 16:44:30 +0000	[thread overview]
Message-ID: <20191107164430.GA13483@willie-the-truck> (raw)
In-Reply-To: <CAFqZXNvn4tTgNcQ3-fFE5QN7dS93h2VNznZ-D=3m2gZ26di9hA@mail.gmail.com>

On Thu, Nov 07, 2019 at 05:12:53PM +0100, Ondrej Mosnacek wrote:
> On Thu, Nov 7, 2019 at 11:17 AM Jeff Vander Stoep <jeffv@google.com> wrote:
> > This replaces the reverse table lookup and reverse cache with a
> > hashtable which improves cache-miss reverse-lookup times from
> > O(n) to O(1)* and maintains the same performance as a reverse
> > cache hit.
> >
> > This reduces the time needed to add a new sidtab entry from ~500us
> > to 5us on a Pixel 3 when there are ~10,000 sidtab entries.
> >
> > The implementation uses the kernel's generic hashtable API,
> > It uses the context's string represtation as the hash source,
> > and the kernels generic string hashing algorithm full_name_hash()
> > to reduce the string to a 32 bit value.
> >
> > This change also maintains the improvement introduced in
> > commit ee1a84fdfeed ("selinux: overhaul sidtab to fix bug and improve
> > performance") which removed the need to keep the current sidtab
> > locked during policy reload. It does however introduce periodic
> > locking of the target sidtab while converting the hashtable. Sidtab
> > entries are never modified or removed, so the context struct stored
> > in the sid_to_context tree can also be used for the context_to_sid
> > hashtable to reduce memory usage.
> >
> > This bug was reported by:
> > - On the selinux bug tracker.
> >   BUG: kernel softlockup due to too many SIDs/contexts #37
> >   https://github.com/SELinuxProject/selinux-kernel/issues/37
> > - Jovana Knezevic on Android's bugtracker.
> >   Bug: 140252993
> >   "During multi-user performance testing, we create and remove users
> >   many times. selinux_android_restorecon_pkgdir goes from 1ms to over
> >   20ms after about 200 user creations and removals. Accumulated over
> >   ~280 packages, that adds a significant time to user creation,
> >   making perf benchmarks unreliable."
> >
> > * Hashtable lookup is only O(1) when n < the number of buckets.
> >
> > Changes in V2:
> > -The hashtable uses sidtab_entry_leaf objects directly so these
> > objects are shared between the sid_to_context lookup tree and the
> > context_to_sid hashtable. This simplifies memory allocation and
> > was suggested by Ondrej Mosnacek.
> > -The new sidtab hash stats file in selinuxfs has been moved out of
> > the avc dir and into a new "ss" dir.
> >
> > V3:
> > -Add lock nesting notation.
> >
> > V4:
> > -Moved to *_rcu variants of the various hashtable functions
> > as suggested by Ondrej Mosnacek and Will Deacon.
> 
> I may be misunderstanding the purpose of RCU, but isn't all this RCU
> stuff a big overkill when these entries are never removed? (Well, they
> are removed when the sidtab is being destroyed, at which point however
> no other threads are accessing them any more.) In my understanding,
> RCU basically makes sure that objects that you dereference in an RCU
> critical section are not destroyed until you leave it. Yes, it also
> helps you to dereference/update them safely, but that can be achieved
> on its own in a simpler way. Instead of using RCU here, I would
> instead suggest looking into adding an equivalent of
> list_for_each_entry_lockless()* for hlist and maybe introduce some
> suitable hlist_add_something() function that ensures consistency
> w.r.t. the lockless traversal (perhaps the WRITE_ONCE() in hlist_add()
> is already sufficient, but I'm not sure...).

If you use the existing _rcu accessors you will get correctly enforced
dependency ordering on the reader side and correctly placed release
ordering on the updater side. I don't think that's a big overkill, and
you can use the RCU accessors to achieve the lockless traversal.

hlist_add() is not safe against a concurrent traversal because the
WRITE_ONCE() provides no memory ordering guarantees, allowing the readers
to see an uninitialised node.

How exactly would list_for_each_entry_lockless() and hlist_add_something()
differ from the RCU variants, assuming they're implemented correctly?

Will

  reply	other threads:[~2019-11-07 16:44 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-07 10:17 [PATCH v5] selinux: sidtab: reverse lookup hash table Jeff Vander Stoep
2019-11-07 12:01 ` Jeffrey Vander Stoep
2019-11-14 23:35   ` Paul Moore
2019-11-07 14:54 ` Stephen Smalley
2019-11-07 15:42   ` Ondrej Mosnacek
2019-11-07 15:59     ` Jeffrey Vander Stoep
2019-11-07 15:49   ` Jeffrey Vander Stoep
2019-11-07 16:12 ` Ondrej Mosnacek
2019-11-07 16:44   ` Will Deacon [this message]
2019-11-07 18:41     ` Ondrej Mosnacek
2019-11-07 20:06       ` Jeffrey Vander Stoep

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191107164430.GA13483@willie-the-truck \
    --to=will@kernel.org \
    --cc=jeffv@google.com \
    --cc=jovanak@google.com \
    --cc=omosnace@redhat.com \
    --cc=paul@paul-moore.com \
    --cc=sds@tycho.nsa.gov \
    --cc=selinux@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).