From: Ondrej Mosnacek <omosnace@redhat.com>
To: William Roberts <bill.c.roberts@gmail.com>
Cc: Nicolas Iooss <nicolas.iooss@m4x.org>,
	SElinux list <selinux@vger.kernel.org>,
	Stephen Smalley <sds@tycho.nsa.gov>
Subject: Re: [PATCH userspace v2] libsepol: cache ebitmap cardinality value
Date: Wed, 26 Feb 2020 14:23:22 +0100
Message-ID: <CAFqZXNvaA7QA7SQryU-nm8Az72MDOxg69S8MpkiidND0FFR0-w@mail.gmail.com>
In-Reply-To: <CAFftDdrEqVV-q-nJZmPUEYwr56YPPGPjv3WE2400QkTqyTo-rQ@mail.gmail.com>

On Tue, Feb 25, 2020 at 10:57 PM William Roberts
<bill.c.roberts@gmail.com> wrote:
> On Tue, Feb 25, 2020 at 3:33 PM Nicolas Iooss <nicolas.iooss@m4x.org> wrote:
> >
> > On Tue, Feb 18, 2020 at 5:01 PM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> > >
> > > On Tue, Feb 18, 2020 at 4:40 PM Stephen Smalley <sds@tycho.nsa.gov> wrote:
> > > > On 2/18/20 10:22 AM, Ondrej Mosnacek wrote:
> > > > > On Thu, Feb 13, 2020 at 2:40 PM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> > > > >> According to profiling of semodule -BN, ebitmap_cardinality() is called
> > > > >> quite often and contributes a lot to the total runtime. Cache its result
> > > > >> in the ebitmap struct to reduce this overhead. The cached value is
> > > > >> invalidated on most modifying operations, but ebitmap_cardinality() is
> > > > >> usually called once the ebitmap doesn't change any more.
> > > > >>
> > > > >> After this patch, the time to do 'semodule -BN' on Fedora Rawhide has
> > > > >> decreased from ~14.6s to ~12.4s (2.2s saved).
> > > > >
> > > > > I have no idea why, but I'm now getting completely different times
> > > > > (10.9s vs. 8.9s) with the same builds on the same setup... I can no
> > > > > longer reproduce the slower times anywhere (F31/locally/...) so I have
> > > > > to assume it was some kind of glitch. Since the numbers show a similar
> > > > > magnitude of speed-up (and they depend on a bunch of HW/SW factors
> > > > > anyway), I'm not going to do another respin. The applying person (most
> > > > > likely Stephen) is free to fix the numbers when applying if they wish
> > > > > to do so.
> > > >
> > > > Thanks, applied with fixed times (although I don't really think it
> > > > matters very much).  Maybe you're also picking up the difference from
> > > > the "libsepol/cil: remove unnecessary hash tables" change.
> > >
> > > No, that was actually the reason for the first correction.
> >
> > Hello,
> > About performance issues, the current implementation of
> > ebitmap_cardinality() is quadratic:
> >
> > for (i=ebitmap_startbit(e1); i < ebitmap_length(e1); i++)
> >     if (ebitmap_get_bit(e1, i))
> >         count++;
> >
> > ... because ebitmap_get_bit() walks the bitmap's node list on every call:
> >
> > while (n && (n->startbit <= bit)) {
> >    if ((n->startbit + MAPSIZE) > bit) {
> >       /*... */

Hm... I didn't realize that the function is actually quadratic.

> >
> > A few years ago, I tried modifying this function to make it linear in
> > the bitmap size:
> >
> > unsigned int ebitmap_cardinality(ebitmap_t *e1)
> > {
> >     unsigned int count = 0;
> >     ebitmap_node_t *n;
> >
> >     for (n = e1->node; n; n = n->next) {
> >         count += __builtin_popcountll(n->map);
> >     }
> >     return count;
> > }
> >
> > ... but never actually sent a patch for this, because I wanted to
> > assess how __builtin_popcountll() was supported by several compilers
> > beforehand. Would this help improve performance even further?
>
> Every architecture I've used has an instruction it boils down to:
> x86: POPCNT
> ARM (NEON): VCNT

Note that the compiler will only emit these instructions if you
compile for a target that has them (-mpopcnt, or an -march= value
that implies it, on x86_64). Portable generic builds will usually not
use them. Still, even without the special instruction,
__builtin_popcountll() should generate better code than the naive
add-each-bit-one-by-one approach. For example, I came up with this
pure C implementation of 64-bit popcount [1] that both GCC and Clang
can compile down to ~36 instructions. The generic version of
__builtin_popcountll() likely does something similar. (Actually, here
is what Clang seems to use [2], which is pretty close.)
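
For illustration, here is a minimal sketch of that kind of branch-free
64-bit popcount. It is not necessarily identical to the code at [1];
the name popcount64 is hypothetical and the masks are the usual
textbook ones:

```c
#include <stdint.h>

/* Portable 64-bit population count using parallel partial sums.
 * popcount64 is a hypothetical name, not a libsepol function. */
static unsigned int popcount64(uint64_t x)
{
    /* Each 2-bit field now holds the count of bits it had set (0..2). */
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    /* Add adjacent 2-bit fields into 4-bit fields (0..4). */
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    /* Add adjacent 4-bit fields into bytes (0..8). */
    x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
    /* The multiply sums all eight bytes into the top byte. */
    return (unsigned int)((x * 0x0101010101010101ULL) >> 56);
}
```

The work is a fixed handful of shifts, masks and adds per 64-bit word,
regardless of how many bits are set.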

FWIW, I tested the __builtin_popcountll() version with the caching
patch reverted (built without popcnt support) and it actually
performed even better than the old code + caching (it went down to
~0.11% of semodule -B running time). A naive popcount implementation
without caching didn't perform as well (it was slower than the old
code + caching).

So... we could just open-code some good generic C implementation
(cleanly written and properly commented, of course) and then we
wouldn't have to rely on the compiler builtin. OTOH, the SELinux
userspace already uses non-standard compiler extensions
(__attribute__(...)), so maybe sticking to pure C is not worth it...
Either way I think we should revert the caching patch along with
switching to an optimized implementation (it would no longer be worth
the added complexity IMO).
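
As a rough sketch of what that could look like, assuming a simplified
node layout (struct demo_node, demo_popcount64 and demo_cardinality
are made-up names, not the libsepol types or API), the cardinality
becomes a single walk over the node list, using the builtin where the
compiler provides it and a generic fallback otherwise:

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for an ebitmap node; NOT the libsepol definition. */
struct demo_node {
    uint64_t map;            /* 64 bits of the bitmap */
    struct demo_node *next;  /* next node, or NULL at the end */
};

/* Count the set bits in one 64-bit word. */
static unsigned int demo_popcount64(uint64_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return (unsigned int)__builtin_popcountll(x);
#else
    /* Generic fallback: clear the lowest set bit until none remain. */
    unsigned int count = 0;
    while (x) {
        x &= x - 1;
        count++;
    }
    return count;
#endif
}

/* Linear in the number of nodes: each node is visited exactly once. */
static unsigned int demo_cardinality(const struct demo_node *head)
{
    unsigned int count = 0;
    const struct demo_node *n;

    for (n = head; n; n = n->next)
        count += demo_popcount64(n->map);
    return count;
}
```

Since nothing is stored between calls, there is no cached value to
invalidate when the bitmap is modified.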

[1] https://gcc.godbolt.org/z/39W7qa
[2] https://github.com/llvm-mirror/compiler-rt/blob/master/lib/builtins/popcountdi2.c

>
> For others (do they even matter at this point?), I would imagine GCC
> does something relatively sane.
>
> >
> > Cheers,
> > Nicolas
> >
>
--
Ondrej Mosnacek <omosnace at redhat dot com>
Software Engineer, Security Technologies
Red Hat, Inc.

