selinux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Paul Moore <paul@paul-moore.com>
To: Ondrej Mosnacek <omosnace@redhat.com>
Cc: selinux@vger.kernel.org, Stephen Smalley <sds@tycho.nsa.gov>
Subject: Re: [PATCH v2 2/2] selinux: optimize storage of filename transitions
Date: Thu, 13 Feb 2020 19:35:41 -0500	[thread overview]
Message-ID: <CAHC9VhRwqRLNgycuX_MSYE83tFJBiresfiYRcz3RYX9Le+pTSw@mail.gmail.com> (raw)
In-Reply-To: <20200212112255.105678-3-omosnace@redhat.com>

On Wed, Feb 12, 2020 at 6:23 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>
> In these rules, each rule with the same (target type, target class,
> filename) values is (in practice) always mapped to the same result type.
> Therefore, it is much more efficient to group the rules by (ttype,
> tclass, filename).
>
> Thus, this patch drops the stype field from the key and changes the
> datum to be a linked list of one or more structures that contain a
> result type and an ebitmap of source types that map the given target to
> the given result type under the given filename. The size of the hash
> table is also incremented to 2048 to be more optimal for Fedora policy
> (which currently has ~2500 unique (ttype, tclass, filename) tuples,
> regardless of whether the 'unconfined' module is enabled).
>
> Not only does this dramtically reduce memory usage when the policy
> contains a lot of unconfined domains (ergo a lot of filename based
> transitions), but it also slightly reduces memory usage of strongly
> confined policies (modeled on Fedora policy with 'unconfined' module
> disabled) and significantly reduces lookup times of these rules on
> Fedora (roughly matches the performance of the rhashtable conversion
> patch [1] posted recently to selinux@vger.kernel.org).
>
> An obvious next step is to change binary policy format to match this
> layout, so that disk space is also saved. However, since that requires
> more work (including matching userspace changes) and this patch is
> already beneficial on its own, I'm posting it separately.
>
> Performance/memory usage comparison:
>
> Kernel           | Policy load | Policy load   | Mem usage | Mem usage     | openbench
>                  |             | (-unconfined) |           | (-unconfined) | (createfiles)
> -----------------|-------------|---------------|-----------|---------------|--------------
> reference        |       1,30s |         0,91s |      90MB |          77MB | 55 us/file
> rhashtable patch |       0.98s |         0,85s |      85MB |          75MB | 38 us/file
> this patch       |       0,95s |         0,87s |      75MB |          75MB | 40 us/file
>
> (Memory usage is measured after boot. With SELinux disabled the memory
> usage was ~60MB on the same system.)
>
> [1] https://lore.kernel.org/selinux/20200116213937.77795-1-dev@lynxeye.de/T/
>
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> ---
>  security/selinux/ss/policydb.c | 173 ++++++++++++++++++++-------------
>  security/selinux/ss/policydb.h |   8 +-
>  security/selinux/ss/services.c |  16 +--
>  3 files changed, 118 insertions(+), 79 deletions(-)

...

> diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
> index 981797bfc547..d8b72718e793 100644
> --- a/security/selinux/ss/policydb.c
> +++ b/security/selinux/ss/policydb.c
> @@ -1882,64 +1884,91 @@ out:
>
>  static int filename_trans_read_one(struct policydb *p, void *fp)
>  {
> -       struct filename_trans *ft;
> -       struct filename_trans_datum *otype = NULL;
> +       struct filename_trans_key key, *ft = NULL;
> +       struct filename_trans_datum *datum, *last, *newdatum = NULL;
> +       uintptr_t stype, otype;
>         char *name = NULL;
>         u32 len;
>         __le32 buf[4];
>         int rc;
> -
> -       ft = kzalloc(sizeof(*ft), GFP_KERNEL);
> -       if (!ft)
> -               return -ENOMEM;
> -
> -       rc = -ENOMEM;
> -       otype = kmalloc(sizeof(*otype), GFP_KERNEL);
> -       if (!otype)
> -               goto out;
> +       bool already_there;
>
>         /* length of the path component string */
>         rc = next_entry(buf, fp, sizeof(u32));
>         if (rc)
> -               goto out;
> +               return rc;
>         len = le32_to_cpu(buf[0]);
>
>         /* path component string */
>         rc = str_read(&name, GFP_KERNEL, fp, len);
>         if (rc)
> -               goto out;
> -
> -       ft->name = name;
> +               return rc;
>
>         rc = next_entry(buf, fp, sizeof(u32) * 4);
>         if (rc)
>                 goto out;
>
> -       ft->stype = le32_to_cpu(buf[0]);
> -       ft->ttype = le32_to_cpu(buf[1]);
> -       ft->tclass = le32_to_cpu(buf[2]);
> +       stype = le32_to_cpu(buf[0]);
> +       key.ttype = le32_to_cpu(buf[1]);
> +       key.tclass = le32_to_cpu(buf[2]);
> +       key.name = name;

We don't really need the "name" variable anymore do we, we can just
use "key.name" instead, right?

>
> -       otype->otype = le32_to_cpu(buf[3]);
> +       otype = le32_to_cpu(buf[3]);
>
> -       rc = ebitmap_set_bit(&p->filename_trans_ttypes, ft->ttype, 1);
> -       if (rc)
> -               goto out;
> +       already_there = false;
> +       last = NULL;
> +       datum = hashtab_search(p->filename_trans, &key);
> +       while (datum) {
> +               if (unlikely(ebitmap_get_bit(&datum->stypes, stype - 1))) {
> +                       already_there = true;
> +                       break;

Considering the "already_there" check below simply jumps to "out", why
do we need to add the "already_there" variable, can't we simply jump
to the "out" label here?  Am I missing something?

> +               }
> +               if (likely(datum->otype == otype))
> +                       break;
> +               last = datum;
> +               datum = datum->next;
> +       }
> +       if (unlikely(already_there))
> +               goto out; /* conflicting/duplicate rules are ignored */
> +       if (!datum) {
> +               rc = -ENOMEM;
> +               newdatum = kmalloc(sizeof(*newdatum), GFP_KERNEL);
> +               if (!newdatum)
> +                       goto out;

By definition "datum" will be NULL here so we can get rid of
"newdatum" and just reuse "datum", yes?  I think the only place where
we would have to worry about the kfree(datum) in the "out" jump label
would be in this (!datum) if block which should be okay ...

> -       rc = hashtab_insert(p->filename_trans, ft, otype);
> -       if (rc) {
> -               /*
> -                * Do not return -EEXIST to the caller, or the system
> -                * will not boot.
> -                */
> -               if (rc == -EEXIST)
> -                       rc = 0;
> -               goto out;
> +               ebitmap_init(&newdatum->stypes);
> +               newdatum->otype = otype;
> +               newdatum->next = NULL;
> +
> +               if (unlikely(last)) {
> +                       last->next = newdatum;
> +               } else {
> +                       rc = -ENOMEM;
> +                       ft = kzalloc(sizeof(*ft), GFP_KERNEL);
> +                       if (!ft)
> +                               goto out;
> +
> +                       *ft = key;

Off the top of my head I don't know the answer to this, and I worry it
may fall into the category of "undefined behavior", but if we are
assigning an entire struct to another dynamically allocated variable
using "=" is the kzalloc() necessary?  Could we save ourselves a few
cycles and safely use kmalloc() for "ft"?

> +                       rc = hashtab_insert(p->filename_trans, ft, newdatum);
> +                       if (rc)
> +                               goto out;
> +                       name = NULL;
> +
> +                       rc = ebitmap_set_bit(&p->filename_trans_ttypes,
> +                                            key.ttype, 1);
> +                       if (rc)
> +                               return rc;
> +               }
> +               datum = newdatum;
>         }
> -       return 0;
> +       kfree(name);
> +       return ebitmap_set_bit(&datum->stypes, stype - 1, 1);
> +
>  out:
>         kfree(ft);
>         kfree(name);
> -       kfree(otype);
> +       kfree(newdatum);
>         return rc;
>  }

-- 
paul moore
www.paul-moore.com

  parent reply	other threads:[~2020-02-14  0:35 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-12 11:22 [PATCH v2 0/2] Optimize storage of filename transitions Ondrej Mosnacek
2020-02-12 11:22 ` [PATCH v2 1/2] selinux: factor out loop body from filename_trans_read() Ondrej Mosnacek
2020-02-12 19:58   ` Stephen Smalley
2020-02-13 23:10   ` Paul Moore
2020-02-12 11:22 ` [PATCH v2 2/2] selinux: optimize storage of filename transitions Ondrej Mosnacek
2020-02-12 19:58   ` Stephen Smalley
2020-02-14  0:35   ` Paul Moore [this message]
2020-02-14  9:12     ` Ondrej Mosnacek
2020-02-17 23:20       ` Paul Moore

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHC9VhRwqRLNgycuX_MSYE83tFJBiresfiYRcz3RYX9Le+pTSw@mail.gmail.com \
    --to=paul@paul-moore.com \
    --cc=omosnace@redhat.com \
    --cc=sds@tycho.nsa.gov \
    --cc=selinux@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).