linux-toolchains.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics
@ 2023-02-08 18:42 Marco Elver
  2023-02-09 22:43 ` Andrey Konovalov
  2023-02-10 19:25 ` Jakub Jelinek
  0 siblings, 2 replies; 12+ messages in thread
From: Marco Elver @ 2023-02-08 18:42 UTC (permalink / raw)
  To: elver, Peter Zijlstra
  Cc: Masahiro Yamada, Nathan Chancellor, Nick Desaulniers,
	Nicolas Schier, Andrey Ryabinin, Alexander Potapenko,
	Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, linux-kbuild,
	kasan-dev, linux-kernel, Ingo Molnar, Tony Lindgren, Ulf Hansson,
	linux-toolchains

Clang 15 will provide an option to prefix calls to memcpy/memset/memmove
with __asan_ in instrumented functions: https://reviews.llvm.org/D122724

GCC does not yet have similar support.

Use it to regain KASAN instrumentation of memcpy/memset/memmove on
architectures that require noinstr to be really free from instrumented
mem*() functions (all GENERIC_ENTRY architectures).

Fixes: 69d4c0d32186 ("entry, kasan, x86: Disallow overriding mem*() functions")
Signed-off-by: Marco Elver <elver@google.com>
---

The Fixes tag is just there to show the dependency, and that people
shouldn't apply this patch without 69d4c0d32186.

---
 scripts/Makefile.kasan | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/scripts/Makefile.kasan b/scripts/Makefile.kasan
index b9e94c5e7097..78336b04c077 100644
--- a/scripts/Makefile.kasan
+++ b/scripts/Makefile.kasan
@@ -38,6 +38,13 @@ endif
 
 CFLAGS_KASAN += $(call cc-param,asan-stack=$(stack_enable))
 
+ifdef CONFIG_GENERIC_ENTRY
+# Instrument memcpy/memset/memmove calls by using instrumented __asan_mem*()
+# instead. With compilers that don't support this option, compiler-inserted
+# memintrinsics won't be checked by KASAN.
+CFLAGS_KASAN += $(call cc-param,asan-kernel-mem-intrinsic-prefix)
+endif
+
 endif # CONFIG_KASAN_GENERIC
 
 ifdef CONFIG_KASAN_SW_TAGS
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics
  2023-02-08 18:42 [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics Marco Elver
@ 2023-02-09 22:43 ` Andrey Konovalov
  2023-02-09 23:34   ` Marco Elver
  2023-02-10 19:25 ` Jakub Jelinek
  1 sibling, 1 reply; 12+ messages in thread
From: Andrey Konovalov @ 2023-02-09 22:43 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Vincenzo Frascino,
	linux-kbuild, kasan-dev, linux-kernel, Ingo Molnar,
	Tony Lindgren, Ulf Hansson, linux-toolchains

On Wed, Feb 8, 2023 at 7:42 PM Marco Elver <elver@google.com> wrote:
>
> Clang 15 will provide an option to prefix calls to memcpy/memset/memmove
> with __asan_ in instrumented functions: https://reviews.llvm.org/D122724

Hi Marco,

Does this option affect all functions or only the ones that are marked
with no_sanitize?

Based on the LLVM patch description, should we also change the normal
memcpy/memset/memmove to be noninstrumented?

These __asan_mem* functions are not defined in the kernel AFAICS.
Should we add them?

Or maybe we should just use "__" as the prefix, as right now __mem*
functions are the ones that are not instrumented?

Thanks!

> GCC does not yet have similar support.
>
> Use it to regain KASAN instrumentation of memcpy/memset/memmove on
> architectures that require noinstr to be really free from instrumented
> mem*() functions (all GENERIC_ENTRY architectures).
>
> Fixes: 69d4c0d32186 ("entry, kasan, x86: Disallow overriding mem*() functions")
> Signed-off-by: Marco Elver <elver@google.com>
> ---
>
> The Fixes tag is just there to show the dependency, and that people
> shouldn't apply this patch without 69d4c0d32186.
>
> ---
>  scripts/Makefile.kasan | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/scripts/Makefile.kasan b/scripts/Makefile.kasan
> index b9e94c5e7097..78336b04c077 100644
> --- a/scripts/Makefile.kasan
> +++ b/scripts/Makefile.kasan
> @@ -38,6 +38,13 @@ endif
>
>  CFLAGS_KASAN += $(call cc-param,asan-stack=$(stack_enable))
>
> +ifdef CONFIG_GENERIC_ENTRY
> +# Instrument memcpy/memset/memmove calls by using instrumented __asan_mem*()
> +# instead. With compilers that don't support this option, compiler-inserted
> +# memintrinsics won't be checked by KASAN.
> +CFLAGS_KASAN += $(call cc-param,asan-kernel-mem-intrinsic-prefix)
> +endif
> +
>  endif # CONFIG_KASAN_GENERIC
>
>  ifdef CONFIG_KASAN_SW_TAGS
> --
> 2.39.1.519.gcb327c4b5f-goog
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics
  2023-02-09 22:43 ` Andrey Konovalov
@ 2023-02-09 23:34   ` Marco Elver
  2023-02-10 16:13     ` Andrey Konovalov
  0 siblings, 1 reply; 12+ messages in thread
From: Marco Elver @ 2023-02-09 23:34 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: Peter Zijlstra, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Vincenzo Frascino,
	linux-kbuild, kasan-dev, linux-kernel, Ingo Molnar,
	Tony Lindgren, Ulf Hansson, linux-toolchains, Mark Rutland

On Thu, 9 Feb 2023 at 23:43, Andrey Konovalov <andreyknvl@gmail.com> wrote:
>
> On Wed, Feb 8, 2023 at 7:42 PM Marco Elver <elver@google.com> wrote:
> >
> > Clang 15 will provide an option to prefix calls to memcpy/memset/memmove
> > with __asan_ in instrumented functions: https://reviews.llvm.org/D122724
>
> Hi Marco,
>
> Does this option affect all functions or only the ones that are marked
> with no_sanitize?

Only functions that are instrumented, i.e. wherever
fsanitize=kernel-address inserts instrumentation.

> Based on the LLVM patch description, should we also change the normal
> memcpy/memset/memmove to be noninstrumented?

They are no longer instrumented as of 69d4c0d32186 (for
CONFIG_GENERIC_ENTRY arches).

> These __asan_mem* functions are not defined in the kernel AFAICS.
> Should we add them?

Peter introduced them in 69d4c0d32186, and we effectively have no
mem*() instrumentation on x86 w/o the compiler-enablement patch here.

> Or maybe we should just use "__" as the prefix, as right now __mem*
> functions are the ones that are not instrumented?

__asan_mem* is for instrumented code, just like ASan userspace does
(actually ASan userspace has been doing it like this forever, just the
kernel was somehow special).

[...]
> > Fixes: 69d4c0d32186 ("entry, kasan, x86: Disallow overriding mem*() functions")
> > Signed-off-by: Marco Elver <elver@google.com>
> > ---
> >
> > The Fixes tag is just there to show the dependency, and that people
> > shouldn't apply this patch without 69d4c0d32186.

^^^ Depends on this commit, which is only in -tip.

> > +ifdef CONFIG_GENERIC_ENTRY

It also only affects GENERIC_ENTRY arches.

> > +# Instrument memcpy/memset/memmove calls by using instrumented __asan_mem*()
> > +# instead. With compilers that don't support this option, compiler-inserted
> > +# memintrinsics won't be checked by KASAN.
> > +CFLAGS_KASAN += $(call cc-param,asan-kernel-mem-intrinsic-prefix)
> > +endif

Probably the same should be done for SW_TAGS, because arm64 will be
GENERIC_ENTRY at one point or another as well.

KASAN + GCC on x86 will have no mem*() instrumentation after
69d4c0d32186, which is sad, so somebody ought to teach it the same
param as above.

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics
  2023-02-09 23:34   ` Marco Elver
@ 2023-02-10 16:13     ` Andrey Konovalov
  2023-02-10 18:40       ` Marco Elver
  0 siblings, 1 reply; 12+ messages in thread
From: Andrey Konovalov @ 2023-02-10 16:13 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Vincenzo Frascino,
	linux-kbuild, kasan-dev, linux-kernel, Ingo Molnar,
	Tony Lindgren, Ulf Hansson, linux-toolchains, Mark Rutland

On Fri, Feb 10, 2023 at 12:35 AM Marco Elver <elver@google.com> wrote:
>
> On Thu, 9 Feb 2023 at 23:43, Andrey Konovalov <andreyknvl@gmail.com> wrote:
> >
> > On Wed, Feb 8, 2023 at 7:42 PM Marco Elver <elver@google.com> wrote:
> > >
> > > Clang 15 will provide an option to prefix calls to memcpy/memset/memmove
> > > with __asan_ in instrumented functions: https://reviews.llvm.org/D122724
> >
> > Hi Marco,
> >
> > Does this option affect all functions or only the ones that are marked
> > with no_sanitize?
>
> Only functions that are instrumented, i.e. wherever
> fsanitize=kernel-address inserts instrumentation.

Ack.

> > Based on the LLVM patch description, should we also change the normal
> > memcpy/memset/memmove to be noninstrumented?
>
> They are no longer instrumented as of 69d4c0d32186 (for
> CONFIG_GENERIC_ENTRY arches).

Ah, sorry, overlooked that.

> > These __asan_mem* functions are not defined in the kernel AFAICS.
> > Should we add them?
>
> Peter introduced them in 69d4c0d32186, and we effectively have no
> mem*() instrumentation on x86 w/o the compiler-enablement patch here.
>
> > Or maybe we should just use "__" as the prefix, as right now __mem*
> > functions are the ones that are not instrumented?
>
> __asan_mem* is for instrumented code, just like ASan userspace does
> (actually ASan userspace has been doing it like this forever, just the
> kernel was somehow special).
>
> [...]
> > > Fixes: 69d4c0d32186 ("entry, kasan, x86: Disallow overriding mem*() functions")
> > > Signed-off-by: Marco Elver <elver@google.com>
> > > ---
> > >
> > > The Fixes tag is just there to show the dependency, and that people
> > > shouldn't apply this patch without 69d4c0d32186.
>
> ^^^ Depends on this commit, which is only in -tip.

Got it. Missed that patch.

> > > +ifdef CONFIG_GENERIC_ENTRY
>
> It also only affects GENERIC_ENTRY arches.
>
> > > +# Instrument memcpy/memset/memmove calls by using instrumented __asan_mem*()
> > > +# instead. With compilers that don't support this option, compiler-inserted
> > > +# memintrinsics won't be checked by KASAN.
> > > +CFLAGS_KASAN += $(call cc-param,asan-kernel-mem-intrinsic-prefix)
> > > +endif
>
> Probably the same should be done for SW_TAGS, because arm64 will be
> GENERIC_ENTRY at one point or another as well.

Yes, makes sense. I'll file a bug for this once I fully understand the
consequences of these changes.

> KASAN + GCC on x86 will have no mem*() instrumentation after
> 69d4c0d32186, which is sad, so somebody ought to teach it the same
> param as above.

Hm, with that patch we would have no KASAN checking within normal mem*
functions (not the ones embedded by the compiler) on GENERIC_ENTRY
arches even with Clang, right?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics
  2023-02-10 16:13     ` Andrey Konovalov
@ 2023-02-10 18:40       ` Marco Elver
  2023-02-10 21:36         ` Andrey Konovalov
  0 siblings, 1 reply; 12+ messages in thread
From: Marco Elver @ 2023-02-10 18:40 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: Peter Zijlstra, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Vincenzo Frascino,
	linux-kbuild, kasan-dev, linux-kernel, Ingo Molnar,
	Tony Lindgren, Ulf Hansson, linux-toolchains, Mark Rutland

On Fri, 10 Feb 2023 at 17:13, Andrey Konovalov <andreyknvl@gmail.com> wrote:
[...]
> > Probably the same should be done for SW_TAGS, because arm64 will be
> > GENERIC_ENTRY at one point or another as well.
>
> Yes, makes sense. I'll file a bug for this once I fully understand the
> consequences of these changes.
>
> > KASAN + GCC on x86 will have no mem*() instrumentation after
> > 69d4c0d32186, which is sad, so somebody ought to teach it the same
> > param as above.
>
> Hm, with that patch we would have no KASAN checking within normal mem*
> functions (not the ones embedded by the compiler) on GENERIC_ENTRY
> arches even with Clang, right?

Yes, that's the point - normal mem*() functions cannot be instrumented
with GENERIC_ENTRY within noinstr functions, because the compiler
sometimes decides to transform normal assignments into
memcpy()/memset(). And if mem*() were instrumented (as it was before
69d4c0d32186), that'd break things for these architectures.

But since most code is normally instrumented, with the right compiler
support (which the patch here enables), we just turn mem*() in
instrumented functions into __asan_mem*(), and get the instrumentation
as before. 69d4c0d32186 already added those __asan functions. The fact
that KASAN used to override mem*() is just the wrong choice in a world
where compilers decide to inline or outline these. From an
instrumentation point of view at the compiler level, we need to treat
them like any other instrumentable instruction (loads, stores,
atomics, etc.): transform each instrumentable instruction into
something that does the right checks. Only then can we be sure that we
don't accidentally instrument something that shouldn't be (noinstr
functions), because instead of relying on the compiler, we forced
instrumentation on every mem*().

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics
  2023-02-08 18:42 [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics Marco Elver
  2023-02-09 22:43 ` Andrey Konovalov
@ 2023-02-10 19:25 ` Jakub Jelinek
  2023-02-10 20:07   ` Marco Elver
  1 sibling, 1 reply; 12+ messages in thread
From: Jakub Jelinek @ 2023-02-10 19:25 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Vincenzo Frascino, linux-kbuild, kasan-dev, linux-kernel,
	Ingo Molnar, Tony Lindgren, Ulf Hansson, linux-toolchains

On Wed, Feb 08, 2023 at 07:42:03PM +0100, Marco Elver wrote:
> Clang 15 will provide an option to prefix calls to memcpy/memset/memmove
> with __asan_ in instrumented functions: https://reviews.llvm.org/D122724
> 
> GCC does not yet have similar support.

GCC has support to rename memcpy/memset etc. for years, say on
following compiled with
-fsanitize=kernel-address -O2 -mstringop-strategy=libcall
(the last option just to make sure the compiler doesn't prefer to emit
rep mov*/stos* or loop or something similar, of course kernel can keep
whatever it uses) you'll get just __asan_memcpy/__asan_memset calls,
no memcpy/memset, while without -fsanitize=kernel-address you get
normally memcpy/memset.
Or do you need the __asan_* functions only in asan instrumented functions
and normal ones in non-instrumented functions in the same TU?

#ifdef __SANITIZE_ADDRESS__
extern __typeof (__builtin_memcpy) memcpy __asm ("__asan_memcpy");
extern __typeof (__builtin_memset) memset __asm ("__asan_memset");
#endif
struct S { char a[2048]; } a, b;

void
foo (void)
{
  a = b;
  b = (struct S) {};
}

void
bar (void *p, void *q, int s)
{
  memcpy (p, q, s);
}

void
baz (void *p, int c, int s)
{
  memset (p, c, s);
}

void
qux (void *p, void *q, int s)
{
  __builtin_memcpy (p, q, s);
}

void
quux (void *p, int c, int s)
{
  __builtin_memset (p, c, s);
}

	Jakub


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics
  2023-02-10 19:25 ` Jakub Jelinek
@ 2023-02-10 20:07   ` Marco Elver
  2023-02-13 11:01     ` Jakub Jelinek
  0 siblings, 1 reply; 12+ messages in thread
From: Marco Elver @ 2023-02-10 20:07 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Peter Zijlstra, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Vincenzo Frascino, linux-kbuild, kasan-dev, linux-kernel,
	Ingo Molnar, Tony Lindgren, Ulf Hansson, linux-toolchains

On Fri, 10 Feb 2023 at 20:25, Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Wed, Feb 08, 2023 at 07:42:03PM +0100, Marco Elver wrote:
> > Clang 15 will provide an option to prefix calls to memcpy/memset/memmove
> > with __asan_ in instrumented functions: https://reviews.llvm.org/D122724
> >
> > GCC does not yet have similar support.
>
> GCC has support to rename memcpy/memset etc. for years, say on
> following compiled with
> -fsanitize=kernel-address -O2 -mstringop-strategy=libcall
> (the last option just to make sure the compiler doesn't prefer to emit
> rep mov*/stos* or loop or something similar, of course kernel can keep
> whatever it uses) you'll get just __asan_memcpy/__asan_memset calls,
> no memcpy/memset, while without -fsanitize=kernel-address you get
> normally memcpy/memset.

> Or do you need the __asan_* functions only in asan instrumented functions
> and normal ones in non-instrumented functions in the same TU?

Yes, exactly that: __asan_ in instrumented, and normal ones in
no_sanitize functions; they can be mixed in the same TU. We can't
rename normal mem*() functions everywhere. In no_sanitize functions
(in particular noinstr), normal mem*() should be used. But in
instrumented code, it should be __asan_mem*(). Another longer
explanation I also just replied here:
https://lore.kernel.org/all/CANpmjNNH-O+38U6zRWJUCU-eJTfMhUosy==GWEOn1vcu=J2dcw@mail.gmail.com/

At least clang has had this behaviour for user space ASan forever:
https://godbolt.org/z/h5sWExzef - so it was easy to just add the flag
to make it behave like in user space for mem*() in the kernel. It
might also be worthwhile for GCC to emit __asan_ for user space, given
that the runtimes are shared and the user space runtime definitely has
__asan_. The kernel needs the param (asan-kernel-mem-intrinsic-prefix)
though, to not break older kernels.

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics
  2023-02-10 18:40       ` Marco Elver
@ 2023-02-10 21:36         ` Andrey Konovalov
  2023-02-13  7:00           ` Marco Elver
  0 siblings, 1 reply; 12+ messages in thread
From: Andrey Konovalov @ 2023-02-10 21:36 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Vincenzo Frascino,
	linux-kbuild, kasan-dev, linux-kernel, Ingo Molnar,
	Tony Lindgren, Ulf Hansson, linux-toolchains, Mark Rutland

On Fri, Feb 10, 2023 at 7:41 PM Marco Elver <elver@google.com> wrote:
>
> On Fri, 10 Feb 2023 at 17:13, Andrey Konovalov <andreyknvl@gmail.com> wrote:
> [...]
> > > Probably the same should be done for SW_TAGS, because arm64 will be
> > > GENERIC_ENTRY at one point or another as well.
> >
> > Yes, makes sense. I'll file a bug for this once I fully understand the
> > consequences of these changes.
> >
> > > KASAN + GCC on x86 will have no mem*() instrumentation after
> > > 69d4c0d32186, which is sad, so somebody ought to teach it the same
> > > param as above.
> >
> > Hm, with that patch we would have no KASAN checking within normal mem*
> > functions (not the ones embedded by the compiler) on GENERIC_ENTRY
> > arches even with Clang, right?
>
> Yes, that's the point - normal mem*() functions cannot be instrumented
> with GENERIC_ENTRY within noinstr functions, because the compiler
> sometimes decides to transform normal assignments into
> memcpy()/memset(). And if mem*() were instrumented (as it was before
> 69d4c0d32186), that'd break things for these architectures.
>
> But since most code is normally instrumented, with the right compiler
> support (which the patch here enables), we just turn mem*() in
> instrumented functions into __asan_mem*(), and get the instrumentation
> as before. 69d4c0d32186 already added those __asan functions. The fact
> that KASAN used to override mem*() is just the wrong choice in a world
> where compilers decide to inline or outline these. From an
> instrumentation point of view at the compiler level, we need to treat
> them like any other instrumentable instruction (loads, stores,
> atomics, etc.): transform each instrumentable instruction into
> something that does the right checks. Only then can we be sure that we
> don't accidentally instrument something that shouldn't be (noinstr
> functions), because instead of relying on the compiler, we forced
> instrumentation on every mem*().

I meant to ask whether the normal mem* calls from instrumented
functions will also be transformed to __asan_mem*() by the compiler.
But following the godbolt link you shared, I see that this is true.

Thank you for the explanation!

So the overall negative impact of these changes is that we don't get
KASAN checking in both normal mem* calls and the ones formed by
transforming assignments for GENERIC_ENTRY architectures with GCC and
with older Clang. This is not great. I wonder if we then need to print
some kind of warning when the kernel is built with these compilers.

If these changes move forward, AFAIU, we can also drop these custom
mem* definitions for non-instrumented files for x86:

https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/string_64.h#L88

Thanks!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics
  2023-02-10 21:36         ` Andrey Konovalov
@ 2023-02-13  7:00           ` Marco Elver
  0 siblings, 0 replies; 12+ messages in thread
From: Marco Elver @ 2023-02-13  7:00 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: Peter Zijlstra, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Vincenzo Frascino,
	linux-kbuild, kasan-dev, linux-kernel, Ingo Molnar,
	Tony Lindgren, Ulf Hansson, linux-toolchains, Mark Rutland

On Fri, 10 Feb 2023 at 22:37, Andrey Konovalov <andreyknvl@gmail.com> wrote:
>
> On Fri, Feb 10, 2023 at 7:41 PM Marco Elver <elver@google.com> wrote:
> >
> > On Fri, 10 Feb 2023 at 17:13, Andrey Konovalov <andreyknvl@gmail.com> wrote:
> > [...]
> > > > Probably the same should be done for SW_TAGS, because arm64 will be
> > > > GENERIC_ENTRY at one point or another as well.
> > >
> > > Yes, makes sense. I'll file a bug for this once I fully understand the
> > > consequences of these changes.
> > >
> > > > KASAN + GCC on x86 will have no mem*() instrumentation after
> > > > 69d4c0d32186, which is sad, so somebody ought to teach it the same
> > > > param as above.
> > >
> > > Hm, with that patch we would have no KASAN checking within normal mem*
> > > functions (not the ones embedded by the compiler) on GENERIC_ENTRY
> > > arches even with Clang, right?
> >
> > Yes, that's the point - normal mem*() functions cannot be instrumented
> > with GENERIC_ENTRY within noinstr functions, because the compiler
> > sometimes decides to transform normal assignments into
> > memcpy()/memset(). And if mem*() were instrumented (as it was before
> > 69d4c0d32186), that'd break things for these architectures.
> >
> > But since most code is normally instrumented, with the right compiler
> > support (which the patch here enables), we just turn mem*() in
> > instrumented functions into __asan_mem*(), and get the instrumentation
> > as before. 69d4c0d32186 already added those __asan functions. The fact
> > that KASAN used to override mem*() is just the wrong choice in a world
> > where compilers decide to inline or outline these. From an
> > instrumentation point of view at the compiler level, we need to treat
> > them like any other instrumentable instruction (loads, stores,
> > atomics, etc.): transform each instrumentable instruction into
> > something that does the right checks. Only then can we be sure that we
> > don't accidentally instrument something that shouldn't be (noinstr
> > functions), because instead of relying on the compiler, we forced
> > instrumentation on every mem*().
>
> I meant to ask whether the normal mem* calls from instrumented
> functions will also be transformed to __asan_mem*() by the compiler.
> But following the godbolt link you shared, I see that this is true.
>
> Thank you for the explanation!
>
> So the overall negative impact of these changes is that we don't get
> KASAN checking in both normal mem* calls and the ones formed by
> transforming assignments for GENERIC_ENTRY architectures with GCC and
> with older Clang. This is not great. I wonder if we then need to print
> some kind of warning when the kernel is built with these compilers.

Since these changes are already in -tip, and by judging from [1],
there really is no other way. As-is, KASAN on x86 is already broken
per [1] (though we got lucky thus far).

Printing a warning wouldn't hurt, but I think nobody would notice the
warning, and if somebody notices, they wouldn't care. Sooner or later,
we just need to make sure that test robots (syzbot, etc.) have new
compilers.

> If these changes move forward, AFAIU, we can also drop these custom
> mem* definitions for non-instrumented files for x86:
>
> https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/string_64.h#L88

Yes, I think so.

[1] https://lore.kernel.org/all/20230112194314.845371875@infradead.org/

Last but not least are you ok with this patch? This patch ought to be
applied to the same tree as 69d4c0d32186 anyway, so this patch lives
or dies by that change.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics
  2023-02-10 20:07   ` Marco Elver
@ 2023-02-13 11:01     ` Jakub Jelinek
  2023-02-13 12:35       ` Peter Zijlstra
  0 siblings, 1 reply; 12+ messages in thread
From: Jakub Jelinek @ 2023-02-13 11:01 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Vincenzo Frascino, linux-kbuild, kasan-dev, linux-kernel,
	Ingo Molnar, Tony Lindgren, Ulf Hansson, linux-toolchains

On Fri, Feb 10, 2023 at 09:07:14PM +0100, Marco Elver wrote:
> On Fri, 10 Feb 2023 at 20:25, Jakub Jelinek <jakub@redhat.com> wrote:
> >
> > On Wed, Feb 08, 2023 at 07:42:03PM +0100, Marco Elver wrote:
> > > Clang 15 will provide an option to prefix calls to memcpy/memset/memmove
> > > with __asan_ in instrumented functions: https://reviews.llvm.org/D122724
> > >
> > > GCC does not yet have similar support.
> >
> > GCC has support to rename memcpy/memset etc. for years, say on
> > following compiled with
> > -fsanitize=kernel-address -O2 -mstringop-strategy=libcall
> > (the last option just to make sure the compiler doesn't prefer to emit
> > rep mov*/stos* or loop or something similar, of course kernel can keep
> > whatever it uses) you'll get just __asan_memcpy/__asan_memset calls,
> > no memcpy/memset, while without -fsanitize=kernel-address you get
> > normally memcpy/memset.
> 
> > Or do you need the __asan_* functions only in asan instrumented functions
> > and normal ones in non-instrumented functions in the same TU?
> 
> Yes, exactly that: __asan_ in instrumented, and normal ones in
> no_sanitize functions; they can be mixed in the same TU. We can't
> rename normal mem*() functions everywhere. In no_sanitize functions
> (in particular noinstr), normal mem*() should be used. But in
> instrumented code, it should be __asan_mem*(). Another longer
> explanation I also just replied here:
> https://lore.kernel.org/all/CANpmjNNH-O+38U6zRWJUCU-eJTfMhUosy==GWEOn1vcu=J2dcw@mail.gmail.com/
> 
> At least clang has had this behaviour for user space ASan forever:
> https://godbolt.org/z/h5sWExzef - so it was easy to just add the flag
> to make it behave like in user space for mem*() in the kernel. It
> might also be worthwhile for GCC to emit __asan_ for user space, given
> that the runtimes are shared and the user space runtime definitely has
> __asan_. The kernel needs the param (asan-kernel-mem-intrinsic-prefix)
> though, to not break older kernels.

So, what exactly you want for gcc to do with
--param asan-kernel-mem-intrinsic-prefix=1 (note, in GCC case it can't be
without the =1 at the end)?

The current gcc behavior is that operations like aggregate copies, or
clearing which might or might not need memcpy/memset/memmove under the hood
later are asan instrumented before the operation (in order not to limit the
choices on how it will be expanded), uses of builtins (__builtin_ prefixed
or not) are also instrumented before the calls unless they are one of the
calls that is recognized as always instrumented.  None for hwasan,
for asan:
index, memchr, memcmp, memcpy, memmove, memset, strcasecmp, strcat, strchr,
strcmp, strcpy, strdup, strlen, strncasecmp, strncat, strncmp, strcspn,
strpbrk, strspn, strstr, strncpy
and for those builtins gcc disables inline expansion and enforces a library
call (but until the expansion they are treated in optimizations like normal
builtins and so could be say DCEd, or their aliasing behavior is considered
etc.).  kasan behaves the same I think.

Now, I think libasan only has __asan_ prefixed
__asan_memmove, __asan_memset and __asan_memcpy, nothing else, so most of
the calls from the above list even can't be prefixed.

So, do you want for --param asan-kernel-mem-intrinsic-prefix=1 to __asan_
prefix just memcpy/memmove/memset and nothing else?  Is it ok to emit
memcpy/memset/memmove from aggregate operations which are instrumented
already at the caller (and similarly is it ok to handle those operations
inline)?

	Jakub


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics
  2023-02-13 11:01     ` Jakub Jelinek
@ 2023-02-13 12:35       ` Peter Zijlstra
  2023-02-13 13:37         ` Marco Elver
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2023-02-13 12:35 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Marco Elver, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Vincenzo Frascino, linux-kbuild, kasan-dev, linux-kernel,
	Ingo Molnar, Tony Lindgren, Ulf Hansson, linux-toolchains

On Mon, Feb 13, 2023 at 12:01:40PM +0100, Jakub Jelinek wrote:

> The current gcc behavior is that operations like aggregate copies, or
> clearing which might or might not need memcpy/memset/memmove under the hood
> later are asan instrumented before the operation (in order not to limit the
> choices on how it will be expanded), uses of builtins (__builtin_ prefixed
> or not) are also instrumented before the calls unless they are one of the
> calls that is recognized as always instrumented.  None for hwasan,
> for asan:
> index, memchr, memcmp, memcpy, memmove, memset, strcasecmp, strcat, strchr,
> strcmp, strcpy, strdup, strlen, strncasecmp, strncat, strncmp, strcspn,
> strpbrk, strspn, strstr, strncpy
> and for those builtins gcc disables inline expansion and enforces a library
> call (but until the expansion they are treated in optimizations like normal
> builtins and so could be say DCEd, or their aliasing behavior is considered
> etc.).  kasan behaves the same I think.
> 
> Now, I think libasan only has __asan_ prefixed
> __asan_memmove, __asan_memset and __asan_memcpy, nothing else, so most of
> the calls from the above list even can't be prefixed.
> 
> So, do you want for --param asan-kernel-mem-intrinsic-prefix=1 to __asan_
> prefix just memcpy/memmove/memset and nothing else?  Is it ok to emit
> memcpy/memset/memmove from aggregate operations which are instrumented
> already at the caller (and similarly is it ok to handle those operations
> inline)?

I'm thinking it is trivial to add more __asan prefixed functions as
needed, while trying to untangle the trainwreck created by assuming the
normal functions are instrumented is much more work.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics
  2023-02-13 12:35       ` Peter Zijlstra
@ 2023-02-13 13:37         ` Marco Elver
  0 siblings, 0 replies; 12+ messages in thread
From: Marco Elver @ 2023-02-13 13:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jakub Jelinek, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Vincenzo Frascino, linux-kbuild, kasan-dev, linux-kernel,
	Ingo Molnar, Tony Lindgren, Ulf Hansson, linux-toolchains

On Mon, 13 Feb 2023 at 13:36, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Feb 13, 2023 at 12:01:40PM +0100, Jakub Jelinek wrote:
>
> > The current gcc behavior is that operations like aggregate copies, or
> > clearing which might or might not need memcpy/memset/memmove under the hood
> > later are asan instrumented before the operation (in order not to limit the
> > choices on how it will be expanded), uses of builtins (__builtin_ prefixed
> > or not) are also instrumented before the calls unless they are one of the
> > calls that is recognized as always instrumented.  None for hwasan,
> > for asan:
> > index, memchr, memcmp, memcpy, memmove, memset, strcasecmp, strcat, strchr,
> > strcmp, strcpy, strdup, strlen, strncasecmp, strncat, strncmp, strcspn,
> > strpbrk, strspn, strstr, strncpy
> > and for those builtins gcc disables inline expansion and enforces a library
> > call (but until the expansion they are treated in optimizations like normal
> > builtins and so could be say DCEd, or their aliasing behavior is considered
> > etc.).  kasan behaves the same I think.
> >
> > Now, I think libasan only has __asan_ prefixed
> > __asan_memmove, __asan_memset and __asan_memcpy, nothing else, so most of
> > the calls from the above list even can't be prefixed.

Correct, right now libasan only does memmove, memset, and memcpy. I
don't think it'll ever do more, at least not in the near future.

> > So, do you want for --param asan-kernel-mem-intrinsic-prefix=1 to __asan_
> > prefix just memcpy/memmove/memset and nothing else?

Yes.

> > Is it ok to emit
> > memcpy/memset/memmove from aggregate operations which are instrumented
> > already at the caller (and similarly is it ok to handle those operations
> > inline)?

Yes, I think that's fair.

> I'm thinking it is trivial to add more __asan prefixed functions as
> needed, while trying to untangle the trainwreck created by assuming the
> normal functions are instrumented is much more work.

For the kernel param, I'd only do memcpy/memmove/memset, as those are
the most brittle ones. The string functions are instrumented on most
architectures through lib/string.c being instrumented.

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-02-13 13:38 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-08 18:42 [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics Marco Elver
2023-02-09 22:43 ` Andrey Konovalov
2023-02-09 23:34   ` Marco Elver
2023-02-10 16:13     ` Andrey Konovalov
2023-02-10 18:40       ` Marco Elver
2023-02-10 21:36         ` Andrey Konovalov
2023-02-13  7:00           ` Marco Elver
2023-02-10 19:25 ` Jakub Jelinek
2023-02-10 20:07   ` Marco Elver
2023-02-13 11:01     ` Jakub Jelinek
2023-02-13 12:35       ` Peter Zijlstra
2023-02-13 13:37         ` Marco Elver

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).