From: Marco Elver <elver@google.com>
To: Jann Horn <jannh@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Alexander Potapenko <glider@google.com>,
"H . Peter Anvin" <hpa@zytor.com>,
"Paul E . McKenney" <paulmck@kernel.org>,
Andrey Konovalov <andreyknvl@google.com>,
Andrey Ryabinin <aryabinin@virtuozzo.com>,
Andy Lutomirski <luto@kernel.org>, Borislav Petkov <bp@alien8.de>,
Catalin Marinas <catalin.marinas@arm.com>,
Christoph Lameter <cl@linux.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
David Rientjes <rientjes@google.com>,
Dmitry Vyukov <dvyukov@google.com>,
Eric Dumazet <edumazet@google.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Hillf Danton <hdanton@sina.com>, Ingo Molnar <mingo@redhat.com>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Jonathan Corbet <corbet@lwn.net>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Kees Cook <keescook@chromium.org>,
Mark Rutland <mark.rutland@arm.com>,
Pekka Enberg <penberg@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
SeongJae Park <sjpark@amazon.com>,
Thomas Gleixner <tglx@linutronix.de>,
Vlastimil Babka <vbabka@suse.cz>, Will Deacon <will@kernel.org>,
"the arch/x86 maintainers" <x86@kernel.org>,
"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
kernel list <linux-kernel@vger.kernel.org>,
kasan-dev <kasan-dev@googlegroups.com>,
Linux ARM <linux-arm-kernel@lists.infradead.org>,
Linux-MM <linux-mm@kvack.org>, SeongJae Park <sjpark@amazon.de>
Subject: Re: [PATCH v4 01/11] mm: add Kernel Electric-Fence infrastructure
Date: Fri, 2 Oct 2020 23:12:16 +0200 [thread overview]
Message-ID: <20201002211216.GA1108095@elver.google.com> (raw)
In-Reply-To: <CAG48ez0D1+hStZaDOigwbqNqFHJAJtXK+8Nadeuiu1Byv+xp5A@mail.gmail.com>
On Fri, Oct 02, 2020 at 09:31PM +0200, Jann Horn wrote:
[...]
> > >
> > > If !CONFIG_HAVE_ARCH_KFENCE_STATIC_POOL, this should probably always
> > > return false if __kfence_pool is NULL, right?
> >
> > That's another check; we don't want to make this more expensive.
>
> Ah, right, I missed that this is the one piece of KFENCE that is
> actually really hot code until Dmitry pointed that out.
>
> But actually, can't you reduce how hot this is for SLUB by moving
> is_kfence_address() down into the freeing slowpath? At the moment you
> use it in slab_free_freelist_hook(), which is in the super-hot
> fastpath, but you should be able to at least move it down into
> __slab_free()...
>
> Actually, you already have hooked into __slab_free(), so can't you
> just get rid of the check in the slab_free_freelist_hook()?
>
> Also, you could do the NULL *after* the range check said "true". That
> way the NULL check would be on the slowpath and have basically no
> performance impact.
True; let's try to do that then, and hope the few extra instructions do
not hurt us.
> > This should never receive a NULL, given the places it's used from, which
> > should only be allocator internals where we already know we have a
> > non-NULL object. If it did receive a NULL, I think something else is
> > wrong. Or did we miss a place where it can legally receive a NULL?
>
> Well... not exactly "legally", but e.g. a kernel NULL deref (landing
> in kfence_handle_page_fault()) might get weird.
>
> [...]
> > > > + access, use-after-free, and invalid-free errors. KFENCE is designed
> > > > + to have negligible cost to permit enabling it in production
> > > > + environments.
> > > [...]
> > > > diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> > > [...]
> > > > +module_param_named(sample_interval, kfence_sample_interval, ulong, 0600);
> > >
> > > This is a writable module parameter, but if the sample interval was 0
> > > or a very large value, changing this value at runtime won't actually
> > > change the effective interval because the work item will never get
> > > kicked off again, right?
> >
> > When KFENCE has been enabled, setting this to 0 actually reschedules the
> > work immediately; we do not disable KFENCE once it has been enabled.
>
> Those are weird semantics. One value should IMO unambiguously mean one
> thing, independent of when it was set. In particular, I think that if
> someone decides to read the current value of kfence_sample_interval
> through sysfs, and sees the value "0", that should not ambiguously
> mean "either kfence triggers all the time or it is completely off".
>
> If you don't want to support runtime disabling, can you maybe make the
> handler refuse to write 0 if kfence has already been initialized?
I could live with 0 being rejected; will change it. (I personally had
used piping 0 at runtime to stress test, but perhaps if it's only devs
doing it we can just change the code for debugging/testing.)
> [...]
> > > > +#endif
> > > [...]
> > > > +/* Freelist with available objects. */
> > > > +static struct list_head kfence_freelist = LIST_HEAD_INIT(kfence_freelist);
> > > > +static DEFINE_RAW_SPINLOCK(kfence_freelist_lock); /* Lock protecting freelist. */
> > > [...]
> > > > +/* Gates the allocation, ensuring only one succeeds in a given period. */
> > > > +static atomic_t allocation_gate = ATOMIC_INIT(1);
> > >
> > > I don't think you need to initialize this to anything?
> > > toggle_allocation_gate() will set it to zero before enabling the
> > > static key, so I don't think anyone will ever see this value.
> >
> > Sure. But does it hurt anyone? At least this way we don't need to think
> > about yet another state that only exists on initialization; who knows
> > what we'll change in future.
>
> Well, no, it doesn't hurt. But I see this as equivalent to writing code like:
>
> int ret = 0;
> ret = -EINVAL;
> if (...)
> return ret;
>
> where a write can never have any effect because a second write will
> clobber the value before it can be read, which is IMO an antipattern.
Agree fully ^
Just being defensive with global states that can potentially be read for
other purposes before toggle_allocation_gate(); I think elsewhere you
e.g. suggested to use allocation_gate for the IPI optimization. It's
these types of changes that depend on our global states, where making
the initial state non-special just saves us trouble.
> But it admittedly is less clear here, so if you like it better your
> way, I don't really have a problem with that.
[...]
> [...]
> > > > +{
> > > > + unsigned long addr;
> > > > +
> > > > + lockdep_assert_held(&meta->lock);
> > > > +
> > > > + for (addr = ALIGN_DOWN(meta->addr, PAGE_SIZE); addr < meta->addr; addr++) {
> > > > + if (!fn((u8 *)addr))
> > > > + break;
> > > > + }
> > > > +
> > > > + for (addr = meta->addr + meta->size; addr < PAGE_ALIGN(meta->addr); addr++) {
> > >
> > > Hmm... if the object is on the left side (meaning meta->addr is
> > > page-aligned) and the padding is on the right side, won't
> > > PAGE_ALIGN(meta->addr)==meta->addr , and therefore none of the padding
> > > will be checked?
> >
> > No, you're thinking of ALIGN_DOWN. PAGE_ALIGN gives us the next page.
>
> Hm, really? Let me go through those macros...
>
>
> #define __AC(X,Y) (X##Y)
> #define _AC(X,Y) __AC(X,Y)
> #define PAGE_SHIFT 12
> #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
>
> so:
> PAGE_SIZE == (1UL << 12) == 0x1000UL
>
> #define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask))
> #define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
> #define ALIGN(x, a) __ALIGN_KERNEL((x), (a))
>
> so (omitting casts):
> ALIGN(x, a) == ((x + (a - 1)) & ~(a - 1))
>
> #define PAGE_ALIGN(addr) ALIGN(addr, PAGE_SIZE)
>
> so (omitting casts):
> PAGE_ALIGN(addr) == ((addr + (0x1000UL - 1)) & ~(0x1000UL - 1))
> == ((addr + 0xfffUL) & 0xfffffffffffff000UL)
>
> meaning that if we e.g. pass in 0x5000, we get:
>
> PAGE_ALIGN(0x5000) == ((0x5000 + 0xfffUL) & 0xfffffffffffff000UL)
> == 0x5fffUL & 0xfffffffffffff000UL == 0x5000UL
>
> So if the object is on the left side (meaning meta->addr is
> page-aligned), we won't check padding.
>
>
> ALIGN_DOWN rounds down, while PAGE_ALIGN rounds up, but both leave the
> value as-is if it is already page-aligned.
Ah, yes, sorry about that; I confused myself with the comment above PAGE_ALIGN.
We'll fix this. And add a test. :-)
Thanks,
-- Marco
next prev parent reply other threads:[~2020-10-02 21:13 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-29 13:38 [PATCH v4 00/11] KFENCE: A low-overhead sampling-based memory safety error detector Marco Elver
2020-09-29 13:38 ` [PATCH v4 01/11] mm: add Kernel Electric-Fence infrastructure Marco Elver
2020-10-02 6:33 ` Jann Horn
2020-10-02 7:53 ` Jann Horn
2020-10-02 14:22 ` Dmitry Vyukov
2020-10-02 15:06 ` Mark Rutland
2020-10-02 18:27 ` Jann Horn
2020-10-05 18:59 ` Marco Elver
2020-10-02 17:19 ` Marco Elver
2020-10-02 19:31 ` Jann Horn
2020-10-02 21:12 ` Marco Elver [this message]
2020-10-02 21:28 ` Marco Elver
2020-10-02 22:27 ` Jann Horn
2020-10-12 14:20 ` Marco Elver
2020-09-29 13:38 ` [PATCH v4 02/11] x86, kfence: enable KFENCE for x86 Marco Elver
2020-10-02 5:45 ` Jann Horn
2020-10-07 13:08 ` Marco Elver
2020-10-07 14:14 ` Jann Horn
2020-10-07 14:41 ` Marco Elver
2020-10-09 17:40 ` Marco Elver
2020-10-02 6:08 ` Jann Horn
2020-09-29 13:38 ` [PATCH v4 03/11] arm64, kfence: enable KFENCE for ARM64 Marco Elver
2020-10-02 6:47 ` Jann Horn
2020-10-02 14:18 ` Marco Elver
2020-10-02 16:10 ` Jann Horn
2020-09-29 13:38 ` [PATCH v4 04/11] mm, kfence: insert KFENCE hooks for SLAB Marco Elver
2020-09-29 13:38 ` [PATCH v4 05/11] mm, kfence: insert KFENCE hooks for SLUB Marco Elver
2020-10-02 7:07 ` Jann Horn
2020-10-05 9:29 ` Alexander Potapenko
2020-09-29 13:38 ` [PATCH v4 06/11] kfence, kasan: make KFENCE compatible with KASAN Marco Elver
2020-09-29 13:38 ` [PATCH v4 07/11] kfence, kmemleak: make KFENCE compatible with KMEMLEAK Marco Elver
2020-09-29 13:38 ` [PATCH v4 08/11] kfence, lockdep: make KFENCE compatible with lockdep Marco Elver
2020-09-29 13:38 ` [PATCH v4 09/11] kfence, Documentation: add KFENCE documentation Marco Elver
2020-09-29 13:38 ` [PATCH v4 10/11] kfence: add test suite Marco Elver
2020-09-29 13:38 ` [PATCH v4 11/11] MAINTAINERS: Add entry for KFENCE Marco Elver
2020-09-29 14:21 ` SeongJae Park
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201002211216.GA1108095@elver.google.com \
--to=elver@google.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=akpm@linux-foundation.org \
--cc=andreyknvl@google.com \
--cc=aryabinin@virtuozzo.com \
--cc=bp@alien8.de \
--cc=catalin.marinas@arm.com \
--cc=cl@linux.com \
--cc=corbet@lwn.net \
--cc=dave.hansen@linux.intel.com \
--cc=dvyukov@google.com \
--cc=edumazet@google.com \
--cc=glider@google.com \
--cc=gregkh@linuxfoundation.org \
--cc=hdanton@sina.com \
--cc=hpa@zytor.com \
--cc=iamjoonsoo.kim@lge.com \
--cc=jannh@google.com \
--cc=kasan-dev@googlegroups.com \
--cc=keescook@chromium.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luto@kernel.org \
--cc=mark.rutland@arm.com \
--cc=mingo@redhat.com \
--cc=paulmck@kernel.org \
--cc=penberg@kernel.org \
--cc=peterz@infradead.org \
--cc=rientjes@google.com \
--cc=sjpark@amazon.com \
--cc=sjpark@amazon.de \
--cc=tglx@linutronix.de \
--cc=vbabka@suse.cz \
--cc=will@kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).