linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ubsan: disable UBSAN_DIV_ZERO for clang
@ 2022-07-14 20:56 Nick Desaulniers
  2022-07-14 21:24 ` Linus Torvalds
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Nick Desaulniers @ 2022-07-14 20:56 UTC (permalink / raw)
  To: Kees Cook
  Cc: Nick Desaulniers, Sudip Mukherjee, Linus Torvalds,
	Nathan Chancellor, Tom Rix, Marco Elver, Andrew Morton,
	Josh Poimboeuf, Peter Zijlstra (Intel),
	linux-kernel, llvm

Building with UBSAN_DIV_ZERO with clang produces numerous fallthrough
warnings from objtool.

In the case of uncheck division, UBSAN_DIV_ZERO may introduce new
control flow to check for division by zero. Because the result of the
division is undefined, LLVM may optimize the control flow such that
after the call to __ubsan_handle_divrem_overflow doesn't matter. If
panic_on_warn was set, __ubsan_handle_divrem_overflow would panic. The
problem is is that panic_on_warn is run time configurable. If it's
disabled, then we cannot guarantee that we will be able to recover
safely.  Disable this config for clang until we can come up with a
solution in LLVM.

Link: https://github.com/ClangBuiltLinux/linux/issues/1657
Link: https://github.com/llvm/llvm-project/issues/56289
Link: https://lore.kernel.org/lkml/CAHk-=wj1qhf7y3VNACEexyp5EbkNpdcu_542k-xZpzmYLOjiCg@mail.gmail.com/
Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
Linus,
I still think we should add explicit checks to gaurd against divide by
zero.

 lib/Kconfig.ubsan | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/Kconfig.ubsan b/lib/Kconfig.ubsan
index a9f7eb047768..fd15230a703b 100644
--- a/lib/Kconfig.ubsan
+++ b/lib/Kconfig.ubsan
@@ -84,6 +84,9 @@ config UBSAN_SHIFT
 config UBSAN_DIV_ZERO
 	bool "Perform checking for integer divide-by-zero"
 	depends on $(cc-option,-fsanitize=integer-divide-by-zero)
+	# https://github.com/ClangBuiltLinux/linux/issues/1657
+	# https://github.com/llvm/llvm-project/issues/56289
+	depends on !CC_IS_CLANG
 	help
 	  This option enables -fsanitize=integer-divide-by-zero which checks
 	  for integer division by zero. This is effectively redundant with the
-- 
2.37.0.170.g444d1eabd0-goog


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] ubsan: disable UBSAN_DIV_ZERO for clang
  2022-07-14 20:56 [PATCH] ubsan: disable UBSAN_DIV_ZERO for clang Nick Desaulniers
@ 2022-07-14 21:24 ` Linus Torvalds
  2022-07-14 21:38   ` Nick Desaulniers
  2022-07-14 21:24 ` Nathan Chancellor
  2022-07-14 23:15 ` Linus Torvalds
  2 siblings, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2022-07-14 21:24 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Kees Cook, Sudip Mukherjee, Nathan Chancellor, Tom Rix,
	Marco Elver, Andrew Morton, Josh Poimboeuf,
	Peter Zijlstra (Intel),
	Linux Kernel Mailing List, clang-built-linux

On Thu, Jul 14, 2022 at 1:56 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> Linus,
> I still think we should add explicit checks to gaurd against divide by
> zero.

I mean, that's what UBSAN_DIV_ZERO is supposed to do.

The fact that clang then messes it up, and turns "I found undefined
behavior" into "I just crashed the machine" is why it needs to be
disabled.

Please conmvince clang people to fix the sanitizer.

  san·i·tize
  /ˈsanəˌtīz/
  verb
  make clean and hygienic; disinfect.

note how "sanitize" is meant to clean things of undefined behavior.

The way you do that is by warning, and giving it defined behavior. It
really is that simple.

Clang seems to warn and then just turn it into ANOTHER - and much
worse - undefined behavior.

In other words, clang doesn't "sanitize" anything at all. It just
moves the mess around and makes it worse.

                    Linus

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ubsan: disable UBSAN_DIV_ZERO for clang
  2022-07-14 20:56 [PATCH] ubsan: disable UBSAN_DIV_ZERO for clang Nick Desaulniers
  2022-07-14 21:24 ` Linus Torvalds
@ 2022-07-14 21:24 ` Nathan Chancellor
  2022-07-14 23:15 ` Linus Torvalds
  2 siblings, 0 replies; 8+ messages in thread
From: Nathan Chancellor @ 2022-07-14 21:24 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Kees Cook, Sudip Mukherjee, Linus Torvalds, Tom Rix, Marco Elver,
	Andrew Morton, Josh Poimboeuf, Peter Zijlstra (Intel),
	linux-kernel, llvm

On Thu, Jul 14, 2022 at 01:56:43PM -0700, Nick Desaulniers wrote:
> Building with UBSAN_DIV_ZERO with clang produces numerous fallthrough
> warnings from objtool.
> 
> In the case of uncheck division, UBSAN_DIV_ZERO may introduce new
> control flow to check for division by zero. Because the result of the
> division is undefined, LLVM may optimize the control flow such that
> after the call to __ubsan_handle_divrem_overflow doesn't matter. If
> panic_on_warn was set, __ubsan_handle_divrem_overflow would panic. The
> problem is is that panic_on_warn is run time configurable. If it's
> disabled, then we cannot guarantee that we will be able to recover
> safely.  Disable this config for clang until we can come up with a
> solution in LLVM.
> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/1657
> Link: https://github.com/llvm/llvm-project/issues/56289
> Link: https://lore.kernel.org/lkml/CAHk-=wj1qhf7y3VNACEexyp5EbkNpdcu_542k-xZpzmYLOjiCg@mail.gmail.com/
> Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Acked-by: Nathan Chancellor <nathan@kernel.org>

> ---
> Linus,
> I still think we should add explicit checks to gaurd against divide by
> zero.
> 
>  lib/Kconfig.ubsan | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/lib/Kconfig.ubsan b/lib/Kconfig.ubsan
> index a9f7eb047768..fd15230a703b 100644
> --- a/lib/Kconfig.ubsan
> +++ b/lib/Kconfig.ubsan
> @@ -84,6 +84,9 @@ config UBSAN_SHIFT
>  config UBSAN_DIV_ZERO
>  	bool "Perform checking for integer divide-by-zero"
>  	depends on $(cc-option,-fsanitize=integer-divide-by-zero)
> +	# https://github.com/ClangBuiltLinux/linux/issues/1657
> +	# https://github.com/llvm/llvm-project/issues/56289
> +	depends on !CC_IS_CLANG
>  	help
>  	  This option enables -fsanitize=integer-divide-by-zero which checks
>  	  for integer division by zero. This is effectively redundant with the
> -- 
> 2.37.0.170.g444d1eabd0-goog
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ubsan: disable UBSAN_DIV_ZERO for clang
  2022-07-14 21:24 ` Linus Torvalds
@ 2022-07-14 21:38   ` Nick Desaulniers
  2022-07-14 21:48     ` Linus Torvalds
  0 siblings, 1 reply; 8+ messages in thread
From: Nick Desaulniers @ 2022-07-14 21:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Sudip Mukherjee, Nathan Chancellor, Tom Rix,
	Marco Elver, Andrew Morton, Josh Poimboeuf,
	Peter Zijlstra (Intel),
	Linux Kernel Mailing List, clang-built-linux,
	Alexander Potapenko

On Thu, Jul 14, 2022 at 2:25 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> The way you do that is by warning, and giving it defined behavior. It
> really is that simple.

int do_div (int dividend, int divisor) {
  return dividend / divisor;
}

has UB should divisor ever be zero, not much different from:

int deref (int *foo) {
  return *foo;
}

when foo is NULL.  Should the two of those be:

int do_div (int dividend, int divisor) {
  if (!divisor)
    return -EOOPS;
  return dividend / divisor;
}
int deref (int *foo) {
  if (!foo)
    return -EOOPS;
  return *foo;
}

or keep the unchecked versions and wait for a report from a user or
bot with a sanitizer splat?

I get the sanitizer doesn't work as advertised. I _agree_ with you.
Hence this patch (which I _think_ works towards your point, shouldn't
you Ack it?).  I feel like you're talking past me without addressing
my point, let me try rephrasing it:

I _additionally_ think we should be adding more checks to guard
against division by zero to the kernel sources.  Or are we happy to
wait and find out if divisors are ever zero and fix them as they pop
up/become problematic?
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ubsan: disable UBSAN_DIV_ZERO for clang
  2022-07-14 21:38   ` Nick Desaulniers
@ 2022-07-14 21:48     ` Linus Torvalds
  0 siblings, 0 replies; 8+ messages in thread
From: Linus Torvalds @ 2022-07-14 21:48 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Kees Cook, Sudip Mukherjee, Nathan Chancellor, Tom Rix,
	Marco Elver, Andrew Morton, Josh Poimboeuf,
	Peter Zijlstra (Intel),
	Linux Kernel Mailing List, clang-built-linux,
	Alexander Potapenko

On Thu, Jul 14, 2022 at 2:38 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> int do_div (int dividend, int divisor) {
>   return dividend / divisor;
> }
>
> has UB should divisor ever be zero, not much different from:

So?

What about the million other '/' in the kernel?

Adding one check to 'do_div()' is just stupid. It's like using a
bottle-cap as an umbrella.

Nick, that's the whole _point_ of having compiler support for things
like this - automation. Because doing them manually one at a time is
just completely broken and stupid.

                Linus

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ubsan: disable UBSAN_DIV_ZERO for clang
  2022-07-14 20:56 [PATCH] ubsan: disable UBSAN_DIV_ZERO for clang Nick Desaulniers
  2022-07-14 21:24 ` Linus Torvalds
  2022-07-14 21:24 ` Nathan Chancellor
@ 2022-07-14 23:15 ` Linus Torvalds
  2022-07-16 17:34   ` Linus Torvalds
  2 siblings, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2022-07-14 23:15 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Kees Cook, Sudip Mukherjee, Nathan Chancellor, Tom Rix,
	Marco Elver, Andrew Morton, Josh Poimboeuf,
	Peter Zijlstra (Intel),
	Linux Kernel Mailing List, clang-built-linux

On Thu, Jul 14, 2022 at 1:56 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> Building with UBSAN_DIV_ZERO with clang produces numerous fallthrough
> warnings from objtool.

Ok, with this applied, things are better.

There are still the "__ubsan_handle_load_invalid_value() with UACCESS
enabled" messages, but those are misfeatures of the kvm cmpxchg
implementation.

I'm not entirely sure why the clang build warns but gcc doesn't, but I
*think* it's because clang is just being silly. It *looks* like it
checks that a "bool" has a value range of 0/1, and will complain if
not.

And the reason I say that's silly is that if I read it correctly, then
that value has literally been generated by clang itself, using "setz"
instruction.

It's the __try_cmpxchg_user_asm() macro, and with clang-14 I have it's
that CONFIG_CC_HAS_ASM_GOTO_TIED_OUTPUT case, and the C code uses
inline asm and does

        asm_volatile_goto("\n"                                          \
                     "1: " LOCK_PREFIX "cmpxchg"itype" %[new], %[ptr]\n"\
                     _ASM_EXTABLE_UA(1b, %l[label])                     \
                     : CC_OUT(z) (success),                             \

where that CC_OUT() in this case turns into

   # define CC_OUT(c) "=@cc" #c

and clang generates this code for it:

   7d01e:       f0 48 0f b1 4d 00       lock cmpxchg %rcx,0x0(%rbp)
   7d024:       49 89 c5                mov    %rax,%r13
   7d027:       0f 94 c0                sete   %al
   7d02a:       41 88 c6                mov    %al,%r14b
   7d02d:       bf 02 00 00 00          mov    $0x2,%edi
   7d032:       44 89 f6                mov    %r14d,%esi
   7d035:       e8 00 00 00 00          call    __sanitizer_cov_trace_const_cmp1
   7d03a:       41 80 fe 01             cmp    $0x1,%r14b
   7d03e:       0f 87 af 01 00 00       ja     7d1f3
<emulator_cmpxchg_emulated+0x6b3>

where that last "ja     7d1f3" is the branch to the code that then
calls __ubsan_handle_load_invalid_value.

But look at that code: it's literally

    sete   %al
    mov    %al,%r14b
    cmp    $0x1,%r14b

where clang has generated that "sete itself, and then it verifies that
the result is "<= 1".

IOW, clang seems to be literally just checking that the "sete"
instruction works right.

That's silly.

Maybe I'm misreading this, but I think the reason the clang build
complains, but the gcc build does not, is simply because gcc isn't
doing crazy checks of how the CPU works.

Some mis-feature of the "asm with flag output" code, where clang
doesn't understand that it generated that code itself, and the "setcc"
instruction always returns 0/1?

The old issue with "memcpy/memset() leaves .noinstr.text section"
because clang has generated out-of-line functions for trivial copies
also remains, but whatever.

             Linus

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ubsan: disable UBSAN_DIV_ZERO for clang
  2022-07-14 23:15 ` Linus Torvalds
@ 2022-07-16 17:34   ` Linus Torvalds
  2022-07-21 14:48     ` Nick Desaulniers
  0 siblings, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2022-07-16 17:34 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Kees Cook, Sudip Mukherjee, Nathan Chancellor, Tom Rix,
	Marco Elver, Andrew Morton, Josh Poimboeuf,
	Peter Zijlstra (Intel),
	Linux Kernel Mailing List, clang-built-linux

On Thu, Jul 14, 2022 at 4:15 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> There are still the "__ubsan_handle_load_invalid_value() with UACCESS
> enabled" messages, but those are misfeatures of the kvm cmpxchg
> implementation.

Ok, so it really is independent of the non-optimal kvm cmpxchg implementation.

The kvm code is fine - yes, it runs a bit too much code under the
CLAC/STAC pair, but that "too much code" really is just a few local
variable assignments.

So any reasonable compiler will keep them in registers, and even if
that's not the case, it would just be a stack spill or reload with AC
set, which is not really a problem.

But calling out to external functions within that SMAP region is
invalid, and objdump warns about it.  Because we really want to
minimize the area where user accesses are ok, and it's a "don't do
this, you now lost the SMAP protection".

ANYWAY.

The reason the clang build warns - but the gcc build does not - is
simply that clang code generation is just nasty.

I decided to make a non-kernel test-case so that clang people can look
at it without having to worry about any kernel code issues or the
details of the SMAP rules, and it really shows how clang generates
horribly pointless and wrong code.

I'm not sure what the right thing to do is to get this sorted out, but
I created an "issue" on github for this:

    https://github.com/llvm/llvm-project/issues/56568

in the hope that this can get fixed. Because it's very clearly a clang
misfeature, where clang basically generates insane code that violates
the rules we try to enforce with objdump.

Nick / Nathan / clang-built-linux people - if there are other better
ways than that github issue thing to make people aware of this, that
would be lovely, and please forward that issue to the right people.

                         Linus

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ubsan: disable UBSAN_DIV_ZERO for clang
  2022-07-16 17:34   ` Linus Torvalds
@ 2022-07-21 14:48     ` Nick Desaulniers
  0 siblings, 0 replies; 8+ messages in thread
From: Nick Desaulniers @ 2022-07-21 14:48 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Sudip Mukherjee, Nathan Chancellor, Tom Rix,
	Marco Elver, Andrew Morton, Josh Poimboeuf,
	Peter Zijlstra (Intel),
	Linux Kernel Mailing List, clang-built-linux

On Sat, Jul 16, 2022 at 10:34 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I decided to make a non-kernel test-case so that clang people can look
> at it without having to worry about any kernel code issues or the
> details of the SMAP rules, and it really shows how clang generates
> horribly pointless and wrong code.

Perfect, thank you.  I was having a hard time reducing this one.

>
> I'm not sure what the right thing to do is to get this sorted out, but
> I created an "issue" on github for this:
>
>     https://github.com/llvm/llvm-project/issues/56568
>
> in the hope that this can get fixed. Because it's very clearly a clang
> misfeature, where clang basically generates insane code that violates
> the rules we try to enforce with objdump.
>
> Nick / Nathan / clang-built-linux people - if there are other better
> ways than that github issue thing to make people aware of this, that
> would be lovely, and please forward that issue to the right people.

Perfect, thank you.

LLVM moved from bugzilla to github for issue tracking last year.  It's
best to file issues there, or in our issue tracker (also github) which
is more specific to issues pertaining to the kernel:
https://github.com/ClangBuiltLinux/linux/issues

You can cc Nathan or myself in bugreports via mentioning @nathanchance
and @nickdesaulniers explicitly, or just forward us the link and we
can subscribe.

Putting out some other fires/regressions at the moment; it looks like
Yuanfang has already started chewing on your bug report.  If they get
stuck, I'll dive in once these regressions are sorted.  Should be an
optimization opportunity regardless of UBSAN when we know only certain
bits are set.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-07-21 14:48 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-14 20:56 [PATCH] ubsan: disable UBSAN_DIV_ZERO for clang Nick Desaulniers
2022-07-14 21:24 ` Linus Torvalds
2022-07-14 21:38   ` Nick Desaulniers
2022-07-14 21:48     ` Linus Torvalds
2022-07-14 21:24 ` Nathan Chancellor
2022-07-14 23:15 ` Linus Torvalds
2022-07-16 17:34   ` Linus Torvalds
2022-07-21 14:48     ` Nick Desaulniers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).