All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paul Moore <paul@paul-moore.com>
To: Ondrej Mosnacek <omosnace@redhat.com>
Cc: selinux@vger.kernel.org,
	Stephen Smalley <stephen.smalley.work@gmail.com>,
	Stephen Smalley <sds@tycho.nsa.gov>
Subject: Re: [PATCH v2 2/3] selinux: prepare for inlining of hashtab functions
Date: Tue, 5 May 2020 07:34:30 -0400	[thread overview]
Message-ID: <CAHC9VhTbTTT4otsEWRPrPewz2j5FEJHODr8N0efG+cX7vph0Hg@mail.gmail.com> (raw)
In-Reply-To: <20200504115923.88828-3-omosnace@redhat.com>

On Mon, May 4, 2020 at 7:59 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>
> Refactor searching and inserting into hashtabs to pave way for
> converting hashtab_search() and hashtab_insert() to inline functions in
> the next patch. This will avoid indirect calls and allow the compiler to
> better optimize individual callers, leading to a drastic performance
> improvement.
>
> In order to avoid the indirect calls, the key hashing and comparison
> callbacks need to be extracted from the hashtab struct and passed
> directly to hashtab_search()/_insert() by the callers so that it is
> always known which callbacks we want to call at compile time. Note that
> the kernel's rhashtable library (<linux/rhashtable*.h>) also does the
> same.
>
> This of course makes the hashtab functions more easy to misuse by
> passing a wrong calback set, but unfortunately there is no better way to
> implement a hash table that is both generic and efficient in C. This
> patch tries to somewhat mitigate this by only calling the hashtab
> functions in the same file where the corresponding callbacks are
> defined (wrapping them into more specialized functions as needed).
>
> Note that this patch doesn't bring any benefit without also defining
> hashtab_search() and -_insert() inline, which is done in a follow-up
> patch to make it easier to review the hashtab.c changes here.
>
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> ---
>  security/selinux/ss/hashtab.c  | 44 ++++++++++----------
>  security/selinux/ss/hashtab.h  | 22 +++++-----
>  security/selinux/ss/mls.c      |  2 +-
>  security/selinux/ss/policydb.c | 76 ++++++++++++++++++++++++----------
>  security/selinux/ss/policydb.h |  9 ++++
>  security/selinux/ss/services.c |  4 +-
>  security/selinux/ss/symtab.c   | 16 ++++---
>  7 files changed, 110 insertions(+), 63 deletions(-)

...

> diff --git a/security/selinux/ss/hashtab.h b/security/selinux/ss/hashtab.h
> index 31c11511fe10..4885234257d4 100644
> --- a/security/selinux/ss/hashtab.h
> +++ b/security/selinux/ss/hashtab.h
> @@ -13,6 +13,12 @@
>
>  #define HASHTAB_MAX_NODES      0xffffffff
>
> +struct hashtab_key_params {
> +       u32 (*hash)(const void *key);   /* hash function */
> +       int (*cmp)(const void *key1, const void *key2);
> +                                       /* key comparison function */
> +};
> +
>  struct hashtab_node {
>         void *key;
>         void *datum;
> @@ -23,10 +29,6 @@ struct hashtab {
>         struct hashtab_node **htable;   /* hash table */
>         u32 size;                       /* number of slots in hash table */
>         u32 nel;                        /* number of elements in hash table */
> -       u32 (*hash_value)(struct hashtab *h, const void *key);
> -                                       /* hash function */
> -       int (*keycmp)(struct hashtab *h, const void *key1, const void *key2);
> -                                       /* key comparison function */
>  };

I remain concerned that in the process of chasing performance this
patchset makes the code worse and more fragile.

What if we took a slightly different approach such that instead of
breaking apart hashtab_node we moved the variable portions (htable,
size, nel) into a separate struct which we could reference in
hashtab_node?  For example:

  struct hashtab_var {
    struct hashtab_node **htable;
    u32 size;
    u32 nel;
  }

  struct hashtab {
    struct hashtab_var *table; // adds an extra ptr deref
    u32 (*hash_value)(...);
    int (*keycmp)(...);
  }

I might be mistaken, but I believe this should still allow you to
implement the inlining/pass-by-value tricks for search and insert
while maintaining a binding between the hashtable data and
hash/compare functions.  Yes?  Or does the entire struct need to be
declared as a static const for the compiler optimizations to work?  It
is not clear to me from the commit descriptions or comments in the
code.

As far as testing is concerned, whenever possible it is better to show
performance improvements in the context of some sort of user workload
and not an artificial stress test or micro benchmark.  I understand
that such focused tests can be attractive both in terms of their
results and ease of use, but they can also be very misleading.
Performance improvements for things like policy loads, kernel
compiles, etc. are much more interesting to me than results from
running something like stress-ng.

-- 
paul moore
www.paul-moore.com

  reply	other threads:[~2020-05-05 11:34 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-04 11:59 [PATCH v2 0/3] Inline some hashtab functions to improve performance Ondrej Mosnacek
2020-05-04 11:59 ` [PATCH v2 1/3] selinux: specialize symtab insert and search functions Ondrej Mosnacek
2020-05-04 11:59 ` [PATCH v2 2/3] selinux: prepare for inlining of hashtab functions Ondrej Mosnacek
2020-05-05 11:34   ` Paul Moore [this message]
2020-05-07  9:19     ` Ondrej Mosnacek
2020-05-04 11:59 ` [PATCH v2 3/3] selinux: complete the " Ondrej Mosnacek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHC9VhTbTTT4otsEWRPrPewz2j5FEJHODr8N0efG+cX7vph0Hg@mail.gmail.com \
    --to=paul@paul-moore.com \
    --cc=omosnace@redhat.com \
    --cc=sds@tycho.nsa.gov \
    --cc=selinux@vger.kernel.org \
    --cc=stephen.smalley.work@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.