From: Paul Moore <paul@paul-moore.com>
To: Ondrej Mosnacek <omosnace@redhat.com>
Cc: selinux@vger.kernel.org, Stephen Smalley <sds@tycho.nsa.gov>
Subject: Re: [PATCH v2 2/2] selinux: optimize storage of filename transitions
Date: Thu, 13 Feb 2020 19:35:41 -0500 [thread overview]
Message-ID: <CAHC9VhRwqRLNgycuX_MSYE83tFJBiresfiYRcz3RYX9Le+pTSw@mail.gmail.com> (raw)
In-Reply-To: <20200212112255.105678-3-omosnace@redhat.com>
On Wed, Feb 12, 2020 at 6:23 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>
> In these rules, each rule with the same (target type, target class,
> filename) values is (in practice) always mapped to the same result type.
> Therefore, it is much more efficient to group the rules by (ttype,
> tclass, filename).
>
> Thus, this patch drops the stype field from the key and changes the
> datum to be a linked list of one or more structures that contain a
> result type and an ebitmap of source types that map the given target to
> the given result type under the given filename. The size of the hash
> table is also incremented to 2048 to be more optimal for Fedora policy
> (which currently has ~2500 unique (ttype, tclass, filename) tuples,
> regardless of whether the 'unconfined' module is enabled).
>
> Not only does this dramtically reduce memory usage when the policy
> contains a lot of unconfined domains (ergo a lot of filename based
> transitions), but it also slightly reduces memory usage of strongly
> confined policies (modeled on Fedora policy with 'unconfined' module
> disabled) and significantly reduces lookup times of these rules on
> Fedora (roughly matches the performance of the rhashtable conversion
> patch [1] posted recently to selinux@vger.kernel.org).
>
> An obvious next step is to change binary policy format to match this
> layout, so that disk space is also saved. However, since that requires
> more work (including matching userspace changes) and this patch is
> already beneficial on its own, I'm posting it separately.
>
> Performance/memory usage comparison:
>
> Kernel | Policy load | Policy load | Mem usage | Mem usage | openbench
> | | (-unconfined) | | (-unconfined) | (createfiles)
> -----------------|-------------|---------------|-----------|---------------|--------------
> reference | 1,30s | 0,91s | 90MB | 77MB | 55 us/file
> rhashtable patch | 0.98s | 0,85s | 85MB | 75MB | 38 us/file
> this patch | 0,95s | 0,87s | 75MB | 75MB | 40 us/file
>
> (Memory usage is measured after boot. With SELinux disabled the memory
> usage was ~60MB on the same system.)
>
> [1] https://lore.kernel.org/selinux/20200116213937.77795-1-dev@lynxeye.de/T/
>
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> ---
> security/selinux/ss/policydb.c | 173 ++++++++++++++++++++-------------
> security/selinux/ss/policydb.h | 8 +-
> security/selinux/ss/services.c | 16 +--
> 3 files changed, 118 insertions(+), 79 deletions(-)
...
> diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
> index 981797bfc547..d8b72718e793 100644
> --- a/security/selinux/ss/policydb.c
> +++ b/security/selinux/ss/policydb.c
> @@ -1882,64 +1884,91 @@ out:
>
> static int filename_trans_read_one(struct policydb *p, void *fp)
> {
> - struct filename_trans *ft;
> - struct filename_trans_datum *otype = NULL;
> + struct filename_trans_key key, *ft = NULL;
> + struct filename_trans_datum *datum, *last, *newdatum = NULL;
> + uintptr_t stype, otype;
> char *name = NULL;
> u32 len;
> __le32 buf[4];
> int rc;
> -
> - ft = kzalloc(sizeof(*ft), GFP_KERNEL);
> - if (!ft)
> - return -ENOMEM;
> -
> - rc = -ENOMEM;
> - otype = kmalloc(sizeof(*otype), GFP_KERNEL);
> - if (!otype)
> - goto out;
> + bool already_there;
>
> /* length of the path component string */
> rc = next_entry(buf, fp, sizeof(u32));
> if (rc)
> - goto out;
> + return rc;
> len = le32_to_cpu(buf[0]);
>
> /* path component string */
> rc = str_read(&name, GFP_KERNEL, fp, len);
> if (rc)
> - goto out;
> -
> - ft->name = name;
> + return rc;
>
> rc = next_entry(buf, fp, sizeof(u32) * 4);
> if (rc)
> goto out;
>
> - ft->stype = le32_to_cpu(buf[0]);
> - ft->ttype = le32_to_cpu(buf[1]);
> - ft->tclass = le32_to_cpu(buf[2]);
> + stype = le32_to_cpu(buf[0]);
> + key.ttype = le32_to_cpu(buf[1]);
> + key.tclass = le32_to_cpu(buf[2]);
> + key.name = name;
We don't really need the "name" variable anymore do we, we can just
use "key.name" instead, right?
>
> - otype->otype = le32_to_cpu(buf[3]);
> + otype = le32_to_cpu(buf[3]);
>
> - rc = ebitmap_set_bit(&p->filename_trans_ttypes, ft->ttype, 1);
> - if (rc)
> - goto out;
> + already_there = false;
> + last = NULL;
> + datum = hashtab_search(p->filename_trans, &key);
> + while (datum) {
> + if (unlikely(ebitmap_get_bit(&datum->stypes, stype - 1))) {
> + already_there = true;
> + break;
Considering the "already_there" check below simply jumps to "out", why
do we need to add the "already_there" variable, can't we simply jump
to the "out" label here? Am I missing something?
> + }
> + if (likely(datum->otype == otype))
> + break;
> + last = datum;
> + datum = datum->next;
> + }
> + if (unlikely(already_there))
> + goto out; /* conflicting/duplicate rules are ignored */
> + if (!datum) {
> + rc = -ENOMEM;
> + newdatum = kmalloc(sizeof(*newdatum), GFP_KERNEL);
> + if (!newdatum)
> + goto out;
By definition "datum" will be NULL here so we can get rid of
"newdatum" and just reuse "datum", yes? I think the only place where
we would have to worry about the kfree(datum) in the "out" jump label
would be in this (!datum) if block which should be okay ...
> - rc = hashtab_insert(p->filename_trans, ft, otype);
> - if (rc) {
> - /*
> - * Do not return -EEXIST to the caller, or the system
> - * will not boot.
> - */
> - if (rc == -EEXIST)
> - rc = 0;
> - goto out;
> + ebitmap_init(&newdatum->stypes);
> + newdatum->otype = otype;
> + newdatum->next = NULL;
> +
> + if (unlikely(last)) {
> + last->next = newdatum;
> + } else {
> + rc = -ENOMEM;
> + ft = kzalloc(sizeof(*ft), GFP_KERNEL);
> + if (!ft)
> + goto out;
> +
> + *ft = key;
Off the top of my head I don't know the answer to this, and I worry it
may fall into the category of "undefined behavior", but if we are
assigning an entire struct to another dynamically allocated variable
using "=" is the kzalloc() necessary? Could we save ourselves a few
cycles and safely use kmalloc() for "ft"?
> + rc = hashtab_insert(p->filename_trans, ft, newdatum);
> + if (rc)
> + goto out;
> + name = NULL;
> +
> + rc = ebitmap_set_bit(&p->filename_trans_ttypes,
> + key.ttype, 1);
> + if (rc)
> + return rc;
> + }
> + datum = newdatum;
> }
> - return 0;
> + kfree(name);
> + return ebitmap_set_bit(&datum->stypes, stype - 1, 1);
> +
> out:
> kfree(ft);
> kfree(name);
> - kfree(otype);
> + kfree(newdatum);
> return rc;
> }
--
paul moore
www.paul-moore.com
next prev parent reply other threads:[~2020-02-14 0:35 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-12 11:22 [PATCH v2 0/2] Optimize storage of filename transitions Ondrej Mosnacek
2020-02-12 11:22 ` [PATCH v2 1/2] selinux: factor out loop body from filename_trans_read() Ondrej Mosnacek
2020-02-12 19:58 ` Stephen Smalley
2020-02-13 23:10 ` Paul Moore
2020-02-12 11:22 ` [PATCH v2 2/2] selinux: optimize storage of filename transitions Ondrej Mosnacek
2020-02-12 19:58 ` Stephen Smalley
2020-02-14 0:35 ` Paul Moore [this message]
2020-02-14 9:12 ` Ondrej Mosnacek
2020-02-17 23:20 ` Paul Moore
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAHC9VhRwqRLNgycuX_MSYE83tFJBiresfiYRcz3RYX9Le+pTSw@mail.gmail.com \
--to=paul@paul-moore.com \
--cc=omosnace@redhat.com \
--cc=sds@tycho.nsa.gov \
--cc=selinux@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).