Re: [PATCH RFC] selinux: policydb - convert filename trans hash to rhashtable

From: Ondrej Mosnacek <omosnace@redhat.com>
To: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Lucas Stach <dev@lynxeye.de>, Paul Moore <paul@paul-moore.com>,
	SElinux list <selinux@vger.kernel.org>,
	Linux kernel mailing list <linux-kernel@vger.kernel.org>,
	Richard Haines <richard_c_haines@btinternet.com>
Subject: Re: [PATCH RFC] selinux: policydb - convert filename trans hash to rhashtable
Date: Tue, 4 Feb 2020 16:49:24 +0100	[thread overview]
Message-ID: <CAFqZXNvih96sEODRJhFvCmx50ROWMb6vF1dK3sUJe_Q4hLiSzw@mail.gmail.com> (raw)
In-Reply-To: <f821809b-548d-fd95-6574-7c74c634353e@tycho.nsa.gov>

On Tue, Feb 4, 2020 at 4:39 PM Stephen Smalley <sds@tycho.nsa.gov> wrote:
> On 2/4/20 10:01 AM, Ondrej Mosnacek wrote:
> > On Fri, Jan 17, 2020 at 8:11 PM Stephen Smalley <sds@tycho.nsa.gov> wrote:
> >> On 1/16/20 4:39 PM, Lucas Stach wrote:
> >>> The current hash is too small for current usages in, e.g. the Fedora standard
> >>> policy. On file creates a considerable amount of CPU time is spent walking the
> >>> the hash chains. Increasing the number of hash buckets somewhat mitigates the
> >>> issue, but doesn't completely get rid of the long hash chains.
> >>>
> >>> This patch does take the bit more invasive route by converting the filename
> >>> trans hash to a rhashtable to allow this hash to scale with load.
> >>>
> >>> fs_mark create benchmark on a SSD device, no ramdisk:
> >>> Count          Size       Files/sec     App Overhead
> >>> before:
> >>> 10000          512        512.3           147715
> >>> after:
> >>> 10000          512        572.3            75141
> >>>
> >>> filenametr_cmp(), which was the topmost function in the CPU cycle trace before
> >>> at ~5% of the overall CPU time, is now down in the noise.
> >>
> >> Thank you for working on this.  IMHO, Fedora overuses name-based type
> >> transitions but that's another topic. I haven't yet investigated the
> >> root cause but with your patch applied, I see some test failures related
> >> to name-based transitions:
> >>
> >> [...]
> >> #   Failed test at overlay/test line 439.
> >> overlay/test ................ 114/119 # Looks like you failed 1 test of 119.
> >> [...]
> >> filesystem/test ............. 3/70 File context error, expected:
> >>          test_filesystem_filenametranscon1_t
> >> got:
> >>          test_filesystem_filetranscon_t
> >>
> >> #   Failed test at filesystem/test line 279.
> >> File context error, expected:
> >>          test_filesystem_filenametranscon2_t
> >> got:
> >>          test_filesystem_filetranscon_t
> >>
> >> #   Failed test at filesystem/test line 286.
> >> filesystem/test ............. 68/70 # Looks like you failed 2 tests of 70.
> >>
> >> You can reproduce by cloning the selinux-testsuite from
> >> https://github.com/SELinuxProject/selinux-testsuite, applying the
> >> filesystem tests patch from
> >> https://patchwork.kernel.org/patch/11337659/,
> >> and following the README.md instructions.
> >
> > I think I figured out what's wrong - see below.
> > <snip>
> >>> @@ -441,6 +442,39 @@ static int filenametr_cmp(struct hashtab *h, const void *k1, const void *k2)
> >>>
> >>>    }
> >>>
> >>> +static const struct rhashtable_params filename_trans_hashparams = {
> >>> +     .nelem_hint             = 1024,
> >>> +     .head_offset            = offsetof(struct filename_trans, hash_head),
> >
> > You need to add:
> >
> > +.hashfn = filenametr_hash,
> >
> > here so that the key is correctly hashed on lookup. After applying
> > this fix, the selinux-testuite passes for me with this patch.
>
> Hmm..does that then make the .obj_hashfn assignment below unnecessary or
> is that required too?  I'm unclear on the difference.

No, they serve different purposes - hashfn is used to hash the key you
pass when you do the lookup, while obj_hashfn is used to hash the full
objects stored in the hash table. In general, these can be different
types, but here we use the same type for both, so the same function
can be reused for both callbacks.

>
> >
> >>> +     .obj_hashfn             = filenametr_hash,
> >>> +     .obj_cmpfn              = filenametr_cmp,
> >>> +};
>

-- 
Ondrej Mosnacek <omosnace at redhat dot com>
Software Engineer, Security Technologies
Red Hat, Inc.