All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paul Moore <paul@paul-moore.com>
To: Ondrej Mosnacek <omosnace@redhat.com>
Cc: selinux@vger.kernel.org, Stephen Smalley <sds@tycho.nsa.gov>
Subject: Re: [PATCH v2 2/2] selinux: optimize storage of filename transitions
Date: Thu, 13 Feb 2020 19:35:41 -0500	[thread overview]
Message-ID: <CAHC9VhRwqRLNgycuX_MSYE83tFJBiresfiYRcz3RYX9Le+pTSw@mail.gmail.com> (raw)
In-Reply-To: <20200212112255.105678-3-omosnace@redhat.com>

On Wed, Feb 12, 2020 at 6:23 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>
> In these rules, each rule with the same (target type, target class,
> filename) values is (in practice) always mapped to the same result type.
> Therefore, it is much more efficient to group the rules by (ttype,
> tclass, filename).
>
> Thus, this patch drops the stype field from the key and changes the
> datum to be a linked list of one or more structures that contain a
> result type and an ebitmap of source types that map the given target to
> the given result type under the given filename. The size of the hash
> table is also incremented to 2048 to be more optimal for Fedora policy
> (which currently has ~2500 unique (ttype, tclass, filename) tuples,
> regardless of whether the 'unconfined' module is enabled).
>
> Not only does this dramtically reduce memory usage when the policy
> contains a lot of unconfined domains (ergo a lot of filename based
> transitions), but it also slightly reduces memory usage of strongly
> confined policies (modeled on Fedora policy with 'unconfined' module
> disabled) and significantly reduces lookup times of these rules on
> Fedora (roughly matches the performance of the rhashtable conversion
> patch [1] posted recently to selinux@vger.kernel.org).
>
> An obvious next step is to change binary policy format to match this
> layout, so that disk space is also saved. However, since that requires
> more work (including matching userspace changes) and this patch is
> already beneficial on its own, I'm posting it separately.
>
> Performance/memory usage comparison:
>
> Kernel           | Policy load | Policy load   | Mem usage | Mem usage     | openbench
>                  |             | (-unconfined) |           | (-unconfined) | (createfiles)
> -----------------|-------------|---------------|-----------|---------------|--------------
> reference        |       1,30s |         0,91s |      90MB |          77MB | 55 us/file
> rhashtable patch |       0.98s |         0,85s |      85MB |          75MB | 38 us/file
> this patch       |       0,95s |         0,87s |      75MB |          75MB | 40 us/file
>
> (Memory usage is measured after boot. With SELinux disabled the memory
> usage was ~60MB on the same system.)
>
> [1] https://lore.kernel.org/selinux/20200116213937.77795-1-dev@lynxeye.de/T/
>
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> ---
>  security/selinux/ss/policydb.c | 173 ++++++++++++++++++++-------------
>  security/selinux/ss/policydb.h |   8 +-
>  security/selinux/ss/services.c |  16 +--
>  3 files changed, 118 insertions(+), 79 deletions(-)

...

> diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
> index 981797bfc547..d8b72718e793 100644
> --- a/security/selinux/ss/policydb.c
> +++ b/security/selinux/ss/policydb.c
> @@ -1882,64 +1884,91 @@ out:
>
>  static int filename_trans_read_one(struct policydb *p, void *fp)
>  {
> -       struct filename_trans *ft;
> -       struct filename_trans_datum *otype = NULL;
> +       struct filename_trans_key key, *ft = NULL;
> +       struct filename_trans_datum *datum, *last, *newdatum = NULL;
> +       uintptr_t stype, otype;
>         char *name = NULL;
>         u32 len;
>         __le32 buf[4];
>         int rc;
> -
> -       ft = kzalloc(sizeof(*ft), GFP_KERNEL);
> -       if (!ft)
> -               return -ENOMEM;
> -
> -       rc = -ENOMEM;
> -       otype = kmalloc(sizeof(*otype), GFP_KERNEL);
> -       if (!otype)
> -               goto out;
> +       bool already_there;
>
>         /* length of the path component string */
>         rc = next_entry(buf, fp, sizeof(u32));
>         if (rc)
> -               goto out;
> +               return rc;
>         len = le32_to_cpu(buf[0]);
>
>         /* path component string */
>         rc = str_read(&name, GFP_KERNEL, fp, len);
>         if (rc)
> -               goto out;
> -
> -       ft->name = name;
> +               return rc;
>
>         rc = next_entry(buf, fp, sizeof(u32) * 4);
>         if (rc)
>                 goto out;
>
> -       ft->stype = le32_to_cpu(buf[0]);
> -       ft->ttype = le32_to_cpu(buf[1]);
> -       ft->tclass = le32_to_cpu(buf[2]);
> +       stype = le32_to_cpu(buf[0]);
> +       key.ttype = le32_to_cpu(buf[1]);
> +       key.tclass = le32_to_cpu(buf[2]);
> +       key.name = name;

We don't really need the "name" variable anymore do we, we can just
use "key.name" instead, right?

>
> -       otype->otype = le32_to_cpu(buf[3]);
> +       otype = le32_to_cpu(buf[3]);
>
> -       rc = ebitmap_set_bit(&p->filename_trans_ttypes, ft->ttype, 1);
> -       if (rc)
> -               goto out;
> +       already_there = false;
> +       last = NULL;
> +       datum = hashtab_search(p->filename_trans, &key);
> +       while (datum) {
> +               if (unlikely(ebitmap_get_bit(&datum->stypes, stype - 1))) {
> +                       already_there = true;
> +                       break;

Considering the "already_there" check below simply jumps to "out", why
do we need to add the "already_there" variable, can't we simply jump
to the "out" label here?  Am I missing something?

> +               }
> +               if (likely(datum->otype == otype))
> +                       break;
> +               last = datum;
> +               datum = datum->next;
> +       }
> +       if (unlikely(already_there))
> +               goto out; /* conflicting/duplicate rules are ignored */
> +       if (!datum) {
> +               rc = -ENOMEM;
> +               newdatum = kmalloc(sizeof(*newdatum), GFP_KERNEL);
> +               if (!newdatum)
> +                       goto out;

By definition "datum" will be NULL here so we can get rid of
"newdatum" and just reuse "datum", yes?  I think the only place where
we would have to worry about the kfree(datum) in the "out" jump label
would be in this (!datum) if block which should be okay ...

> -       rc = hashtab_insert(p->filename_trans, ft, otype);
> -       if (rc) {
> -               /*
> -                * Do not return -EEXIST to the caller, or the system
> -                * will not boot.
> -                */
> -               if (rc == -EEXIST)
> -                       rc = 0;
> -               goto out;
> +               ebitmap_init(&newdatum->stypes);
> +               newdatum->otype = otype;
> +               newdatum->next = NULL;
> +
> +               if (unlikely(last)) {
> +                       last->next = newdatum;
> +               } else {
> +                       rc = -ENOMEM;
> +                       ft = kzalloc(sizeof(*ft), GFP_KERNEL);
> +                       if (!ft)
> +                               goto out;
> +
> +                       *ft = key;

Off the top of my head I don't know the answer to this, and I worry it
may fall into the category of "undefined behavior", but if we are
assigning an entire struct to another dynamically allocated variable
using "=" is the kzalloc() necessary?  Could we save ourselves a few
cycles and safely use kmalloc() for "ft"?

> +                       rc = hashtab_insert(p->filename_trans, ft, newdatum);
> +                       if (rc)
> +                               goto out;
> +                       name = NULL;
> +
> +                       rc = ebitmap_set_bit(&p->filename_trans_ttypes,
> +                                            key.ttype, 1);
> +                       if (rc)
> +                               return rc;
> +               }
> +               datum = newdatum;
>         }
> -       return 0;
> +       kfree(name);
> +       return ebitmap_set_bit(&datum->stypes, stype - 1, 1);
> +
>  out:
>         kfree(ft);
>         kfree(name);
> -       kfree(otype);
> +       kfree(newdatum);
>         return rc;
>  }

-- 
paul moore
www.paul-moore.com

  parent reply	other threads:[~2020-02-14  0:35 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-12 11:22 [PATCH v2 0/2] Optimize storage of filename transitions Ondrej Mosnacek
2020-02-12 11:22 ` [PATCH v2 1/2] selinux: factor out loop body from filename_trans_read() Ondrej Mosnacek
2020-02-12 19:58   ` Stephen Smalley
2020-02-13 23:10   ` Paul Moore
2020-02-12 11:22 ` [PATCH v2 2/2] selinux: optimize storage of filename transitions Ondrej Mosnacek
2020-02-12 19:58   ` Stephen Smalley
2020-02-14  0:35   ` Paul Moore [this message]
2020-02-14  9:12     ` Ondrej Mosnacek
2020-02-17 23:20       ` Paul Moore

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHC9VhRwqRLNgycuX_MSYE83tFJBiresfiYRcz3RYX9Le+pTSw@mail.gmail.com \
    --to=paul@paul-moore.com \
    --cc=omosnace@redhat.com \
    --cc=sds@tycho.nsa.gov \
    --cc=selinux@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.