All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] perf/x86/intel: Mark expected switch fall-throughs
@ 2019-06-24 16:19 Gustavo A. R. Silva
  2019-06-24 19:31 ` Peter Zijlstra
  2019-07-25 16:27 ` [tip:perf/urgent] " tip-bot for Gustavo A. R. Silva
  0 siblings, 2 replies; 52+ messages in thread
From: Gustavo A. R. Silva @ 2019-06-24 16:19 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	Borislav Petkov, H. Peter Anvin, x86, Kan Liang
  Cc: linux-kernel, Gustavo A. R. Silva, Kees Cook

In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.

This patch fixes the following warnings:

arch/x86/events/intel/core.c: In function ‘intel_pmu_init’:
arch/x86/events/intel/core.c:4959:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
   pmem = true;
   ~~~~~^~~~~~
arch/x86/events/intel/core.c:4960:2: note: here
  case INTEL_FAM6_SKYLAKE_MOBILE:
  ^~~~
arch/x86/events/intel/core.c:5008:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
   pmem = true;
   ~~~~~^~~~~~
arch/x86/events/intel/core.c:5009:2: note: here
  case INTEL_FAM6_ICELAKE_MOBILE:
  ^~~~

Warning level 3 was used: -Wimplicit-fallthrough=3

This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
---
 arch/x86/events/intel/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index bda450ff51ee..6154f052aa5f 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4957,6 +4957,7 @@ __init int intel_pmu_init(void)
 
 	case INTEL_FAM6_SKYLAKE_X:
 		pmem = true;
+		/* fall through */
 	case INTEL_FAM6_SKYLAKE_MOBILE:
 	case INTEL_FAM6_SKYLAKE_DESKTOP:
 	case INTEL_FAM6_KABYLAKE_MOBILE:
@@ -5006,6 +5007,7 @@ __init int intel_pmu_init(void)
 	case INTEL_FAM6_ICELAKE_X:
 	case INTEL_FAM6_ICELAKE_XEON_D:
 		pmem = true;
+		/* fall through */
 	case INTEL_FAM6_ICELAKE_MOBILE:
 	case INTEL_FAM6_ICELAKE_DESKTOP:
 		x86_pmu.late_ack = true;
-- 
2.21.0


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-24 16:19 [PATCH] perf/x86/intel: Mark expected switch fall-throughs Gustavo A. R. Silva
@ 2019-06-24 19:31 ` Peter Zijlstra
  2019-06-24 19:45   ` Joe Perches
  2019-07-25 16:27 ` [tip:perf/urgent] " tip-bot for Gustavo A. R. Silva
  1 sibling, 1 reply; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-24 19:31 UTC (permalink / raw)
  To: Gustavo A. R. Silva
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, x86, Kan Liang, linux-kernel, Kees Cook

On Mon, Jun 24, 2019 at 11:19:13AM -0500, Gustavo A. R. Silva wrote:
> In preparation to enabling -Wimplicit-fallthrough, mark switch
> cases where we are expecting to fall through.
> 
> This patch fixes the following warnings:
> 
> arch/x86/events/intel/core.c: In function ‘intel_pmu_init’:
> arch/x86/events/intel/core.c:4959:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
>    pmem = true;
>    ~~~~~^~~~~~
> arch/x86/events/intel/core.c:4960:2: note: here
>   case INTEL_FAM6_SKYLAKE_MOBILE:
>   ^~~~
> arch/x86/events/intel/core.c:5008:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
>    pmem = true;
>    ~~~~~^~~~~~
> arch/x86/events/intel/core.c:5009:2: note: here
>   case INTEL_FAM6_ICELAKE_MOBILE:
>   ^~~~
> 
> Warning level 3 was used: -Wimplicit-fallthrough=3
> 
> This patch is part of the ongoing efforts to enable
> -Wimplicit-fallthrough.
> 
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

I still consider it an abomination that the C parser looks at comments
-- other than to delete them, but OK I suppose, I'll take it.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-24 19:31 ` Peter Zijlstra
@ 2019-06-24 19:45   ` Joe Perches
  2019-06-24 20:37     ` Peter Zijlstra
  0 siblings, 1 reply; 52+ messages in thread
From: Joe Perches @ 2019-06-24 19:45 UTC (permalink / raw)
  To: Peter Zijlstra, Gustavo A. R. Silva
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, x86, Kan Liang, linux-kernel, Kees Cook,
	Miguel Ojeda, Shawn Landden

On Mon, 2019-06-24 at 21:31 +0200, Peter Zijlstra wrote:
> On Mon, Jun 24, 2019 at 11:19:13AM -0500, Gustavo A. R. Silva wrote:
> > In preparation to enabling -Wimplicit-fallthrough, mark switch
> > cases where we are expecting to fall through.
> > 
> > This patch fixes the following warnings:
> > 
> > arch/x86/events/intel/core.c: In function ‘intel_pmu_init’:
> > arch/x86/events/intel/core.c:4959:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
> >    pmem = true;
> >    ~~~~~^~~~~~
> > arch/x86/events/intel/core.c:4960:2: note: here
> >   case INTEL_FAM6_SKYLAKE_MOBILE:
> >   ^~~~
> > arch/x86/events/intel/core.c:5008:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
> >    pmem = true;
> >    ~~~~~^~~~~~
> > arch/x86/events/intel/core.c:5009:2: note: here
> >   case INTEL_FAM6_ICELAKE_MOBILE:
> >   ^~~~
> > 
> > Warning level 3 was used: -Wimplicit-fallthrough=3
> > 
> > This patch is part of the ongoing efforts to enable
> > -Wimplicit-fallthrough.
> > 
> > Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
> 
> I still consider it an abomination that the C parser looks at comments
> -- other than to delete them, but OK I suppose, I'll take it.

I still believe Arnaldo's/Miguel's/Shawn's/my et al. suggestion of

#define __fallthrough __attribute__((fallthrough))

is far better.

https://lkml.org/lkml/2017/2/9/845
https://lkml.org/lkml/2017/2/10/485
https://lore.kernel.org/lkml/20181021171414.22674-2-miguel.ojeda.sandonis@gmail.com/
https://lore.kernel.org/lkml/20190617155643.GA32544@amd/




^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-24 19:45   ` Joe Perches
@ 2019-06-24 20:37     ` Peter Zijlstra
  2019-06-24 20:53       ` Gustavo A. R. Silva
  0 siblings, 1 reply; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-24 20:37 UTC (permalink / raw)
  To: Joe Perches
  Cc: Gustavo A. R. Silva, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	Borislav Petkov, H. Peter Anvin, x86, Kan Liang, linux-kernel,
	Kees Cook, Miguel Ojeda, Shawn Landden

On Mon, Jun 24, 2019 at 12:45:54PM -0700, Joe Perches wrote:
> On Mon, 2019-06-24 at 21:31 +0200, Peter Zijlstra wrote:
> > On Mon, Jun 24, 2019 at 11:19:13AM -0500, Gustavo A. R. Silva wrote:
> > > In preparation to enabling -Wimplicit-fallthrough, mark switch
> > > cases where we are expecting to fall through.
> > > 
> > > This patch fixes the following warnings:
> > > 
> > > arch/x86/events/intel/core.c: In function ‘intel_pmu_init’:
> > > arch/x86/events/intel/core.c:4959:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
> > >    pmem = true;
> > >    ~~~~~^~~~~~
> > > arch/x86/events/intel/core.c:4960:2: note: here
> > >   case INTEL_FAM6_SKYLAKE_MOBILE:
> > >   ^~~~
> > > arch/x86/events/intel/core.c:5008:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
> > >    pmem = true;
> > >    ~~~~~^~~~~~
> > > arch/x86/events/intel/core.c:5009:2: note: here
> > >   case INTEL_FAM6_ICELAKE_MOBILE:
> > >   ^~~~
> > > 
> > > Warning level 3 was used: -Wimplicit-fallthrough=3
> > > 
> > > This patch is part of the ongoing efforts to enable
> > > -Wimplicit-fallthrough.
> > > 
> > > Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
> > 
> > I still consider it an abomination that the C parser looks at comments
> > -- other than to delete them, but OK I suppose, I'll take it.
> 
> I still believe Arnaldo's/Miguel's/Shawn's/my et al. suggestion of
> 
> #define __fallthrough __attribute__((fallthrough))
> 
> is far better.
> 
> https://lkml.org/lkml/2017/2/9/845
> https://lkml.org/lkml/2017/2/10/485
> https://lore.kernel.org/lkml/20181021171414.22674-2-miguel.ojeda.sandonis@gmail.com/
> https://lore.kernel.org/lkml/20190617155643.GA32544@amd/

Oh yes, worlds better. Please, can we haz that instead?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-24 20:37     ` Peter Zijlstra
@ 2019-06-24 20:53       ` Gustavo A. R. Silva
  2019-06-24 20:57         ` Joe Perches
                           ` (2 more replies)
  0 siblings, 3 replies; 52+ messages in thread
From: Gustavo A. R. Silva @ 2019-06-24 20:53 UTC (permalink / raw)
  To: Peter Zijlstra, Joe Perches
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, x86, Kan Liang, linux-kernel, Kees Cook,
	Miguel Ojeda, Shawn Landden



On 6/24/19 3:37 PM, Peter Zijlstra wrote:
> On Mon, Jun 24, 2019 at 12:45:54PM -0700, Joe Perches wrote:
>> On Mon, 2019-06-24 at 21:31 +0200, Peter Zijlstra wrote:
>>> On Mon, Jun 24, 2019 at 11:19:13AM -0500, Gustavo A. R. Silva wrote:
>>>> In preparation to enabling -Wimplicit-fallthrough, mark switch
>>>> cases where we are expecting to fall through.
>>>>
>>>> This patch fixes the following warnings:
>>>>
>>>> arch/x86/events/intel/core.c: In function ‘intel_pmu_init’:
>>>> arch/x86/events/intel/core.c:4959:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
>>>>    pmem = true;
>>>>    ~~~~~^~~~~~
>>>> arch/x86/events/intel/core.c:4960:2: note: here
>>>>   case INTEL_FAM6_SKYLAKE_MOBILE:
>>>>   ^~~~
>>>> arch/x86/events/intel/core.c:5008:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
>>>>    pmem = true;
>>>>    ~~~~~^~~~~~
>>>> arch/x86/events/intel/core.c:5009:2: note: here
>>>>   case INTEL_FAM6_ICELAKE_MOBILE:
>>>>   ^~~~
>>>>
>>>> Warning level 3 was used: -Wimplicit-fallthrough=3
>>>>
>>>> This patch is part of the ongoing efforts to enable
>>>> -Wimplicit-fallthrough.
>>>>
>>>> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
>>>
>>> I still consider it an abomination that the C parser looks at comments
>>> -- other than to delete them, but OK I suppose, I'll take it.
>>
>> I still believe Arnaldo's/Miguel's/Shawn's/my et al. suggestion of
>>
>> #define __fallthrough __attribute__((fallthrough))
>>
>> is far better.
>>
>> https://lkml.org/lkml/2017/2/9/845
>> https://lkml.org/lkml/2017/2/10/485
>> https://lore.kernel.org/lkml/20181021171414.22674-2-miguel.ojeda.sandonis@gmail.com/
>> https://lore.kernel.org/lkml/20190617155643.GA32544@amd/
> 
> Oh yes, worlds better. Please, can we haz that instead?
> 

Once the C++17 `__attribute__((fallthrough))` is more widely handled by C compilers,
static analyzers, and IDEs, we can switch to using that instead. Also, we are a few
warnings away (less than five) from being able to enable -Wimplicit-fallthrough. After
this option has been finally enabled (in v5.3) we can easily go and replace the comments
to whatever we agree upon.

Thanks
--
Gustavo


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-24 20:53       ` Gustavo A. R. Silva
@ 2019-06-24 20:57         ` Joe Perches
  2019-06-25  7:20           ` Peter Zijlstra
  2019-06-24 22:28         ` Miguel Ojeda
  2019-06-25  7:15         ` Peter Zijlstra
  2 siblings, 1 reply; 52+ messages in thread
From: Joe Perches @ 2019-06-24 20:57 UTC (permalink / raw)
  To: Gustavo A. R. Silva, Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, x86, Kan Liang, linux-kernel, Kees Cook,
	Miguel Ojeda, Shawn Landden

On Mon, 2019-06-24 at 15:53 -0500, Gustavo A. R. Silva wrote:
> On 6/24/19 3:37 PM, Peter Zijlstra wrote:
> > On Mon, Jun 24, 2019 at 12:45:54PM -0700, Joe Perches wrote:
> > > On Mon, 2019-06-24 at 21:31 +0200, Peter Zijlstra wrote:
> > > > I still consider it an abomination that the C parser looks at comments
> > > > -- other than to delete them, but OK I suppose, I'll take it.
> > > I still believe Arnaldo's/Miguel's/Shawn's/my et al. suggestion of
> > > #define __fallthrough __attribute__((fallthrough))
> > > is far better.
> > Oh yes, worlds better. Please, can we haz that instead?
> Once the C++17 `__attribute__((fallthrough))` is more widely handled by C compilers,
> static analyzers, and IDEs, we can switch to using that instead.
> Also, we are a few
> warnings away (less than five) from being able to enable -Wimplicit-fallthrough. After
> this option has been finally enabled (in v5.3) we can easily go and replace the comments
> to whatever we agree upon.

I doubt waiting is better.
If the latest compilers catch it, it's
probably good enough.

fallthrough or __fallthrough.  I don't care which.

I also doubt most static analyzers will parse all
#include headers to find the #define.



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-24 20:53       ` Gustavo A. R. Silva
  2019-06-24 20:57         ` Joe Perches
@ 2019-06-24 22:28         ` Miguel Ojeda
  2019-06-25  7:18           ` Peter Zijlstra
  2019-06-25  7:15         ` Peter Zijlstra
  2 siblings, 1 reply; 52+ messages in thread
From: Miguel Ojeda @ 2019-06-24 22:28 UTC (permalink / raw)
  To: Gustavo A. R. Silva
  Cc: Peter Zijlstra, Joe Perches, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Kees Cook, Shawn Landden

On Mon, Jun 24, 2019 at 10:53 PM Gustavo A. R. Silva
<gustavo@embeddedor.com> wrote:
>
> Once the C++17 `__attribute__((fallthrough))` is more widely handled by C compilers,
> static analyzers, and IDEs, we can switch to using that instead. Also, we are a few
> warnings away (less than five) from being able to enable -Wimplicit-fallthrough. After
> this option has been finally enabled (in v5.3) we can easily go and replace the comments
> to whatever we agree upon.

Indeed -- the decision last year was to wait for a while since not
everyone had support for it. My branch is waiting here:

    https://github.com/ojeda/linux/tree/compiler-attributes-fallthrough

The good news is that there is some progress. For instance, LLVM is
working on supporting the GNU spelling:

    https://reviews.llvm.org/D63260
    https://bugs.llvm.org/show_bug.cgi?id=37135
    https://github.com/ClangBuiltLinux/linux/issues/235

Also note that C2x may get [[fallthrough]]. See N2267 and N2335. At
that point, surely tools/IDEs/analyzers will support it :-) The
question is whether we want to wait that long to replace the comments.

On related news, we also may get __has_c_attribute() standardized
(i.e. we use __has_attribute() now), too, see N2333.

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-24 20:53       ` Gustavo A. R. Silva
  2019-06-24 20:57         ` Joe Perches
  2019-06-24 22:28         ` Miguel Ojeda
@ 2019-06-25  7:15         ` Peter Zijlstra
  2 siblings, 0 replies; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-25  7:15 UTC (permalink / raw)
  To: Gustavo A. R. Silva
  Cc: Joe Perches, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	Borislav Petkov, H. Peter Anvin, x86, Kan Liang, linux-kernel,
	Kees Cook, Miguel Ojeda, Shawn Landden

On Mon, Jun 24, 2019 at 03:53:04PM -0500, Gustavo A. R. Silva wrote:
> Once the C++17 `__attribute__((fallthrough))` is more widely handled by C compilers,

From what I read that attribute landed in the exact same GCC version as
the warning. And last I checked clang wasn't there yet anyway.

> static analyzers, and IDEs, we can switch to using that instead. Also, we are a few

I don't give a crap about lousy IDEs. And coverity already supports the
attribute and other checkers are open-source and can be easily fixed or
ignored.

> warnings away (less than five) from being able to enable -Wimplicit-fallthrough. After
> this option has been finally enabled (in v5.3) we can easily go and replace the comments
> to whatever we agree upon.

Feh. Still an abomination.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-24 22:28         ` Miguel Ojeda
@ 2019-06-25  7:18           ` Peter Zijlstra
  2019-06-25 12:47             ` Miguel Ojeda
  2019-06-25 17:12             ` Kees Cook
  0 siblings, 2 replies; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-25  7:18 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Gustavo A. R. Silva, Joe Perches, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Kees Cook, Shawn Landden

On Tue, Jun 25, 2019 at 12:28:23AM +0200, Miguel Ojeda wrote:
> On Mon, Jun 24, 2019 at 10:53 PM Gustavo A. R. Silva
> <gustavo@embeddedor.com> wrote:
> >
> > Once the C++17 `__attribute__((fallthrough))` is more widely handled by C compilers,
> > static analyzers, and IDEs, we can switch to using that instead. Also, we are a few
> > warnings away (less than five) from being able to enable -Wimplicit-fallthrough. After
> > this option has been finally enabled (in v5.3) we can easily go and replace the comments
> > to whatever we agree upon.
> 
> Indeed -- the decision last year was to wait for a while since not
> everyone had support for it. My branch is waiting here:
> 
>     https://github.com/ojeda/linux/tree/compiler-attributes-fallthrough
> 
> The good news is that there is some progress. For instance, LLVM is
> working on supporting the GNU spelling:
> 
>     https://reviews.llvm.org/D63260
>     https://bugs.llvm.org/show_bug.cgi?id=37135
>     https://github.com/ClangBuiltLinux/linux/issues/235

Can it build a kernel without patches yet? That is, why should I care
what LLVM does?

> Also note that C2x may get [[fallthrough]]. See N2267 and N2335. At
> that point, surely tools/IDEs/analyzers will support it :-) The
> question is whether we want to wait that long to replace the comments.

#define __fallthrough [[fallthrough]]

right?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-24 20:57         ` Joe Perches
@ 2019-06-25  7:20           ` Peter Zijlstra
  0 siblings, 0 replies; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-25  7:20 UTC (permalink / raw)
  To: Joe Perches
  Cc: Gustavo A. R. Silva, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	Borislav Petkov, H. Peter Anvin, x86, Kan Liang, linux-kernel,
	Kees Cook, Miguel Ojeda, Shawn Landden

On Mon, Jun 24, 2019 at 01:57:49PM -0700, Joe Perches wrote:

> > Once the C++17 `__attribute__((fallthrough))` is more widely handled by C compilers,

> I doubt waiting is better.
> If the latest compilers catch it, it's
> probably good enough.

Yeah, I don't see the point either; GCC does it, and that's all I really
care about.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25  7:18           ` Peter Zijlstra
@ 2019-06-25 12:47             ` Miguel Ojeda
  2019-06-25 18:15               ` Nick Desaulniers
  2019-06-25 17:12             ` Kees Cook
  1 sibling, 1 reply; 52+ messages in thread
From: Miguel Ojeda @ 2019-06-25 12:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Gustavo A. R. Silva, Joe Perches, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Kees Cook, Shawn Landden,
	Nick Desaulniers, Nathan Chancellor, Luc Van Oostenryck

On Tue, Jun 25, 2019 at 9:19 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> Can it build a kernel without patches yet? That is, why should I care
> what LLVM does?

Having more than a single compiler is always a good idea. You benefit
from more warnings, more tooling, a second implementation for
reference/comparison, etc.

As for what is the current state, I think they are close, specially
for aarch64, but I let Nick, Nathan et. al. answer that! :-) (Cc'd).
They had a talk in FOSDEM 2019 about it, too.

Also CC'ing Luc since he changed sparse to stop ignoring the attribute
so that __has_attribute() would work, but I am not sure if there has
been further work on supporting it properly.

> > Also note that C2x may get [[fallthrough]]. See N2267 and N2335. At
> > that point, surely tools/IDEs/analyzers will support it :-) The
> > question is whether we want to wait that long to replace the comments.
>
> #define __fallthrough [[fallthrough]]
>
> right?

Yes and no. The exact spelling we use does not matter much. My point
with that paragraph was that since C2x will (maybe) add fallthrough,
as C++17 did, every compiler/analyzer/IDE/etc. that is still missing
support for it will have to eventually add it even if they ignore GNU
attributes. At that point, I would guess most will likely add all
spellings too.

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25  7:18           ` Peter Zijlstra
  2019-06-25 12:47             ` Miguel Ojeda
@ 2019-06-25 17:12             ` Kees Cook
  2019-06-25 18:05               ` Nathan Chancellor
  2019-06-26  8:06               ` Peter Zijlstra
  1 sibling, 2 replies; 52+ messages in thread
From: Kees Cook @ 2019-06-25 17:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Miguel Ojeda, Gustavo A. R. Silva, Joe Perches, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden

On Tue, Jun 25, 2019 at 09:18:46AM +0200, Peter Zijlstra wrote:
> Can it build a kernel without patches yet? That is, why should I care
> what LLVM does?

Yes. LLVM trunk builds and boots x86 now. As for distro availability,
AIUI, the asm-goto feature missed the 9.0 LLVM branch point, so it'll
appear in the following release.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 17:12             ` Kees Cook
@ 2019-06-25 18:05               ` Nathan Chancellor
  2019-06-25 19:53                 ` Thomas Gleixner
  2019-06-25 20:09                 ` Kees Cook
  2019-06-26  8:06               ` Peter Zijlstra
  1 sibling, 2 replies; 52+ messages in thread
From: Nathan Chancellor @ 2019-06-25 18:05 UTC (permalink / raw)
  To: Kees Cook
  Cc: Peter Zijlstra, Miguel Ojeda, Gustavo A. R. Silva, Joe Perches,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden

On Tue, Jun 25, 2019 at 10:12:42AM -0700, Kees Cook wrote:
> On Tue, Jun 25, 2019 at 09:18:46AM +0200, Peter Zijlstra wrote:
> > Can it build a kernel without patches yet? That is, why should I care
> > what LLVM does?
> 
> Yes. LLVM trunk builds and boots x86 now. As for distro availability,
> AIUI, the asm-goto feature missed the 9.0 LLVM branch point, so it'll
> appear in the following release.
> 
> -- 
> Kees Cook

I don't think that's right. LLVM 9 hasn't been branched yet so it should
make it in.

http://lists.llvm.org/pipermail/llvm-dev/2019-June/133155.html

If anyone wants to play around with it before then, we wrote a
self-contained script that will build an LLVM toolchain suitable for
kernel development:

https://github.com/ClangBuiltLinux/tc-build

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 12:47             ` Miguel Ojeda
@ 2019-06-25 18:15               ` Nick Desaulniers
  2019-06-25 22:29                 ` Joe Perches
  2019-06-26  8:49                 ` Peter Zijlstra
  0 siblings, 2 replies; 52+ messages in thread
From: Nick Desaulniers @ 2019-06-25 18:15 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Gustavo A. R. Silva, Joe Perches, Miguel Ojeda, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Kees Cook, Shawn Landden,
	Nathan Chancellor, Luc Van Oostenryck, Chandler Carruth

On Tue, Jun 25, 2019 at 5:47 AM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
>
> On Tue, Jun 25, 2019 at 9:19 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > Can it build a kernel without patches yet? That is, why should I care
> > what LLVM does?

Unreleased versions of Clang built from source can; the latest release
of Clang-8 doesn't have asm goto support required for
CONFIG_JUMP_LABEL.  Things can get complicated based on which kernel
tree/branch (mainline, -next, stable), arch, and configs, but I think
we just have a few long tail bugs left.

We're currently planning multiple output constraint support w/ asm
goto, and have recently implemented things like
__GCC_ASM_FLAG_OUTPUTS__.  If there's other features that we should
start implementing, please let us know.

Give it a shot and let us know what works or doesn't.

For more info, see our
site: https://clangbuiltlinux.github.io/
CI: https://travis-ci.com/ClangBuiltLinux/continuous-integration
Bug tracker: https://github.com/ClangBuiltLinux/linux/issues
wiki: https://github.com/ClangBuiltLinux/linux/wiki

Or reach out to us via
email: clang-built-linux@googlegroups.com
irc: #clangbuiltlinux on chat.freenode.net
or attend our public bi-weekly (once ever 2 weeks, not twice a week) meeting

>
> Having more than a single compiler is always a good idea. You benefit
> from more warnings, more tooling, a second implementation for
> reference/comparison, etc.
>
> As for what is the current state, I think they are close, specially
> for aarch64, but I let Nick, Nathan et. al. answer that! :-) (Cc'd).
> They had a talk in FOSDEM 2019 about it, too.
>
> Also CC'ing Luc since he changed sparse to stop ignoring the attribute
> so that __has_attribute() would work, but I am not sure if there has
> been further work on supporting it properly.
>
> > > Also note that C2x may get [[fallthrough]]. See N2267 and N2335. At
> > > that point, surely tools/IDEs/analyzers will support it :-) The
> > > question is whether we want to wait that long to replace the comments.
> >
> > #define __fallthrough [[fallthrough]]
> >
> > right?
>
> Yes and no. The exact spelling we use does not matter much. My point
> with that paragraph was that since C2x will (maybe) add fallthrough,
> as C++17 did, every compiler/analyzer/IDE/etc. that is still missing
> support for it will have to eventually add it even if they ignore GNU
> attributes. At that point, I would guess most will likely add all
> spellings too.

Regarding __attribute__((fallthrough)), it's currently a work in
progress in Clang:
https://bugs.llvm.org/show_bug.cgi?id=37135
https://reviews.llvm.org/D63260

-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 18:05               ` Nathan Chancellor
@ 2019-06-25 19:53                 ` Thomas Gleixner
  2019-06-25 20:27                   ` Nathan Chancellor
  2019-06-25 20:09                 ` Kees Cook
  1 sibling, 1 reply; 52+ messages in thread
From: Thomas Gleixner @ 2019-06-25 19:53 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Kees Cook, Peter Zijlstra, Miguel Ojeda, Gustavo A. R. Silva,
	Joe Perches, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden

On Tue, 25 Jun 2019, Nathan Chancellor wrote:
> On Tue, Jun 25, 2019 at 10:12:42AM -0700, Kees Cook wrote:
> > On Tue, Jun 25, 2019 at 09:18:46AM +0200, Peter Zijlstra wrote:
> > > Can it build a kernel without patches yet? That is, why should I care
> > > what LLVM does?
> > 
> > Yes. LLVM trunk builds and boots x86 now. As for distro availability,
> > AIUI, the asm-goto feature missed the 9.0 LLVM branch point, so it'll
> > appear in the following release.
> > 
> > -- 
> > Kees Cook
> 
> I don't think that's right. LLVM 9 hasn't been branched yet so it should
> make it in.
> 
> http://lists.llvm.org/pipermail/llvm-dev/2019-June/133155.html
> 
> If anyone wants to play around with it before then, we wrote a
> self-contained script that will build an LLVM toolchain suitable for
> kernel development:
> 
> https://github.com/ClangBuiltLinux/tc-build

Useful!

But can the script please check for a minimal clang version required to
build that thing.

The default clang-3.8 which is installed on Debian stretch explodes. The
6.0 variant from backports works as advertised.

Kernel builds with the new shiny compiler. Jump labels seem to be enabled.

It complains about a few type conversions:

 arch/x86/kvm/mmu.c:4596:39: warning: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -205 to 51 [-Wconstant-conversion]
                u8 wf = (pfec & PFERR_WRITE_MASK) ? ~w : 0;
                   ~~                               ^~

but it also makes objtool unhappy:

 arch/x86/events/intel/core.o: warning: objtool: intel_pmu_nhm_workaround()+0xb3: unreachable instruction
 kernel/fork.o: warning: objtool: free_thread_stack()+0x126: unreachable instruction
 mm/workingset.o: warning: objtool: count_shadow_nodes()+0x11f: unreachable instruction
 arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable instruction
 arch/x86/kernel/platform-quirks.o: warning: objtool: x86_early_init_platform_quirks()+0x84: unreachable instruction
 drivers/iommu/irq_remapping.o: warning: objtool: irq_remap_enable_fault_handling()+0x1d: unreachable instruction

Kernel boots. As I'm currently benchmarking VDSO performance, this was
obviosly my first test. Compared to the same kernel built with gcc6.3 the
performance of the VDSO drops slightly.

It's below 1%. Though I need to run the same tests on 4 other uarchs to get
the full picture. This stuff is randomly changing behaviour accross uarchs
depending on how the c source is arranged. So nothing to worry about (yet).

Thanks,

	tglx


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 18:05               ` Nathan Chancellor
  2019-06-25 19:53                 ` Thomas Gleixner
@ 2019-06-25 20:09                 ` Kees Cook
  1 sibling, 0 replies; 52+ messages in thread
From: Kees Cook @ 2019-06-25 20:09 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Peter Zijlstra, Miguel Ojeda, Gustavo A. R. Silva, Joe Perches,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden

On Tue, Jun 25, 2019 at 11:05:25AM -0700, Nathan Chancellor wrote:
> On Tue, Jun 25, 2019 at 10:12:42AM -0700, Kees Cook wrote:
> > On Tue, Jun 25, 2019 at 09:18:46AM +0200, Peter Zijlstra wrote:
> > > Can it build a kernel without patches yet? That is, why should I care
> > > what LLVM does?
> > 
> > Yes. LLVM trunk builds and boots x86 now. As for distro availability,
> > AIUI, the asm-goto feature missed the 9.0 LLVM branch point, so it'll
> > appear in the following release.
> > 
> > -- 
> > Kees Cook
> 
> I don't think that's right. LLVM 9 hasn't been branched yet so it should
> make it in.
> 
> http://lists.llvm.org/pipermail/llvm-dev/2019-June/133155.html
> 
> If anyone wants to play around with it before then, we wrote a
> self-contained script that will build an LLVM toolchain suitable for
> kernel development:
> 
> https://github.com/ClangBuiltLinux/tc-build

Ah! That's good news. :) I thought the branch happened just before
asm-goto landed. Wheee!

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 19:53                 ` Thomas Gleixner
@ 2019-06-25 20:27                   ` Nathan Chancellor
  2019-06-25 20:37                     ` Nick Desaulniers
                                       ` (2 more replies)
  0 siblings, 3 replies; 52+ messages in thread
From: Nathan Chancellor @ 2019-06-25 20:27 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Kees Cook, Peter Zijlstra, Miguel Ojeda, Gustavo A. R. Silva,
	Joe Perches, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux

Hi Thomas,

On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:
> On Tue, 25 Jun 2019, Nathan Chancellor wrote:
> > On Tue, Jun 25, 2019 at 10:12:42AM -0700, Kees Cook wrote:
> > > On Tue, Jun 25, 2019 at 09:18:46AM +0200, Peter Zijlstra wrote:
> > > > Can it build a kernel without patches yet? That is, why should I care
> > > > what LLVM does?
> > > 
> > > Yes. LLVM trunk builds and boots x86 now. As for distro availability,
> > > AIUI, the asm-goto feature missed the 9.0 LLVM branch point, so it'll
> > > appear in the following release.
> > > 
> > > -- 
> > > Kees Cook
> > 
> > I don't think that's right. LLVM 9 hasn't been branched yet so it should
> > make it in.
> > 
> > http://lists.llvm.org/pipermail/llvm-dev/2019-June/133155.html
> > 
> > If anyone wants to play around with it before then, we wrote a
> > self-contained script that will build an LLVM toolchain suitable for
> > kernel development:
> > 
> > https://github.com/ClangBuiltLinux/tc-build
> 
> Useful!
> 
> But can the script please check for a minimal clang version required to
> build that thing.
> 
> The default clang-3.8 which is installed on Debian stretch explodes. The
> 6.0 variant from backports works as advertised.
> 

Hmmm interesting, I test a lot of different distros using Docker
containers to make sure the script works universally and that includes
Debian stretch, which is the stress tester because all of the packages
are older. I install the following packages then run the following
command and it works fine for me (just tested):

$ apt update && apt install -y --no-install-recommends ca-certificates \
ccache clang cmake curl file gcc g++ git make ninja-build python3 \
texinfo zlib1g-dev
$ ./build-llvm.py

If you could give me a build log, I'd be happy to look into it and see
what I can do.

> Kernel builds with the new shiny compiler. Jump labels seem to be enabled.
> 
> It complains about a few type conversions:
> 
>  arch/x86/kvm/mmu.c:4596:39: warning: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -205 to 51 [-Wconstant-conversion]
>                 u8 wf = (pfec & PFERR_WRITE_MASK) ? ~w : 0;
>                    ~~                               ^~
> 

Yes, there was a patch sent to try and fix this but it was rejected by
the maintainers:

https://github.com/ClangBuiltLinux/linux/issues/95

https://lore.kernel.org/lkml/20180619192504.180479-1-mka@chromium.org/

> but it also makes objtool unhappy:
> 
>  arch/x86/events/intel/core.o: warning: objtool: intel_pmu_nhm_workaround()+0xb3: unreachable instruction
>  kernel/fork.o: warning: objtool: free_thread_stack()+0x126: unreachable instruction
>  mm/workingset.o: warning: objtool: count_shadow_nodes()+0x11f: unreachable instruction
>  arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable instruction
>  arch/x86/kernel/platform-quirks.o: warning: objtool: x86_early_init_platform_quirks()+0x84: unreachable instruction
>  drivers/iommu/irq_remapping.o: warning: objtool: irq_remap_enable_fault_handling()+0x1d: unreachable instruction
> 

Unfortunately, we have quite a few of those outstanding, it's probably
time to start really taking a look at them:

https://github.com/ClangBuiltLinux/linux/labels/objtool

> Kernel boots. As I'm currently benchmarking VDSO performance, this was
> obviosly my first test. Compared to the same kernel built with gcc6.3 the
> performance of the VDSO drops slightly.
> 
> It's below 1%. Though I need to run the same tests on 4 other uarchs to get
> the full picture. This stuff is randomly changing behaviour accross uarchs
> depending on how the c source is arranged. So nothing to worry about (yet).
> 

Thanks for trying it out and letting us know. Please keep us in the loop
if you happen to find anything amiss.

Cheers,
Nathan

> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 20:27                   ` Nathan Chancellor
@ 2019-06-25 20:37                     ` Nick Desaulniers
  2019-06-25 21:47                     ` Thomas Gleixner
  2019-06-25 23:46                     ` Arnaldo Carvalho de Melo
  2 siblings, 0 replies; 52+ messages in thread
From: Nick Desaulniers @ 2019-06-25 20:37 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra
  Cc: Kees Cook, Miguel Ojeda, Gustavo A. R. Silva, Joe Perches,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Nathan Chancellor

On Tue, Jun 25, 2019 at 1:27 PM Nathan Chancellor
<natechancellor@gmail.com> wrote:
> On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:
> > On Tue, 25 Jun 2019, Nathan Chancellor wrote:
> > > On Tue, Jun 25, 2019 at 10:12:42AM -0700, Kees Cook wrote:
> > > > On Tue, Jun 25, 2019 at 09:18:46AM +0200, Peter Zijlstra wrote:
> > > > > Can it build a kernel without patches yet? That is, why should I care
> > > > > what LLVM does?
> > > >
> > > > Yes. LLVM trunk builds and boots x86 now. As for distro availability,
> > > > AIUI, the asm-goto feature missed the 9.0 LLVM branch point, so it'll
> > > > appear in the following release.

Kees meant asm goto missed 8.0 LLVM branch.  9.0 which is unreleased
will have it.

> > Kernel boots. As I'm currently benchmarking VDSO performance, this was
> > obviosly my first test. Compared to the same kernel built with gcc6.3 the
> > performance of the VDSO drops slightly.
> >
> > It's below 1%. Though I need to run the same tests on 4 other uarchs to get
> > the full picture. This stuff is randomly changing behaviour accross uarchs
> > depending on how the c source is arranged. So nothing to worry about (yet).

Thank you very much for testing and for these reports.  We've been
working through:
1. make it build
2. make it boot
3. make it run well
4. add features

With some amount of cycles in the above.  Now that most of the build
issues have been resolved, we're more able to focus our resources on 3
and 4.  Please report unexpected regressions to our mailing list
clang-built-linux@googlegroups.com and we'll track/follow up upstream
on the LLVM side.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 20:27                   ` Nathan Chancellor
  2019-06-25 20:37                     ` Nick Desaulniers
@ 2019-06-25 21:47                     ` Thomas Gleixner
  2019-06-26  5:10                       ` Nathan Chancellor
                                         ` (2 more replies)
  2019-06-25 23:46                     ` Arnaldo Carvalho de Melo
  2 siblings, 3 replies; 52+ messages in thread
From: Thomas Gleixner @ 2019-06-25 21:47 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Kees Cook, Peter Zijlstra, Miguel Ojeda, Gustavo A. R. Silva,
	Joe Perches, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf

Nathan,

On Tue, 25 Jun 2019, Nathan Chancellor wrote:
> On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:
> > 
> > But can the script please check for a minimal clang version required to
> > build that thing.
> > 
> > The default clang-3.8 which is installed on Debian stretch explodes. The
> > 6.0 variant from backports works as advertised.
> > 
> 
> Hmmm interesting, I test a lot of different distros using Docker
> containers to make sure the script works universally and that includes
> Debian stretch, which is the stress tester because all of the packages
> are older. I install the following packages then run the following
> command and it works fine for me (just tested):
> 
> $ apt update && apt install -y --no-install-recommends ca-certificates \
> ccache clang cmake curl file gcc g++ git make ninja-build python3 \
> texinfo zlib1g-dev
> $ ./build-llvm.py
> 
> If you could give me a build log, I'd be happy to look into it and see
> what I can do.

I can produce one tomorrow.
 
> > Kernel builds with the new shiny compiler. Jump labels seem to be enabled.
> > 
> > It complains about a few type conversions:
> > 
> >  arch/x86/kvm/mmu.c:4596:39: warning: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -205 to 51 [-Wconstant-conversion]
> >                 u8 wf = (pfec & PFERR_WRITE_MASK) ? ~w : 0;
> >                    ~~                               ^~
> > 
> 
> Yes, there was a patch sent to try and fix this but it was rejected by
> the maintainers:
> 
> https://github.com/ClangBuiltLinux/linux/issues/95
> 
> https://lore.kernel.org/lkml/20180619192504.180479-1-mka@chromium.org/

Just looked through it. I don't think it's an outright reject. Paolo was
not totally against it and then the whole discussion degraded into bikeshed
painting and bitching about compiler error messaged. Try again or should I?

> > but it also makes objtool unhappy:
> > 
> >  arch/x86/events/intel/core.o: warning: objtool: intel_pmu_nhm_workaround()+0xb3: unreachable
instruction
> >  kernel/fork.o: warning: objtool: free_thread_stack()+0x126: unreachable instruction
> >  mm/workingset.o: warning: objtool: count_shadow_nodes()+0x11f: unreachable instruction
> >  arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable
instruction
> >  arch/x86/kernel/platform-quirks.o: warning: objtool: x86_early_init_platform_quirks()+0x84:
unreachable instruction
> >  drivers/iommu/irq_remapping.o: warning: objtool: irq_remap_enable_fault_handling()+0x1d:
unreachable instruction

> Unfortunately, we have quite a few of those outstanding, it's probably
> time to start really taking a look at them:
> 
> https://github.com/ClangBuiltLinux/linux/labels/objtool

I just checked two of them in the disassembly. In both cases it's jump
label related. Here is one:

      asm volatile("1: rdmsr\n"
 410:   b9 59 02 00 00          mov    $0x259,%ecx
 415:   0f 32                   rdmsr
 417:   49 89 c6                mov    %rax,%r14
 41a:   48 89 d3                mov    %rdx,%rbx
      return EAX_EDX_VAL(val, low, high);
 41d:   48 c1 e3 20             shl    $0x20,%rbx
 421:   48 09 c3                or     %rax,%rbx
 424:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
 429:   eb 0f                   jmp    43a <get_fixed_ranges+0xaa>
      do_trace_read_msr(msr, val, 0);
 42b:   bf 59 02 00 00          mov    $0x259,%edi   <------- "unreachable"
 430:   48 89 de                mov    %rbx,%rsi
 433:   31 d2                   xor    %edx,%edx
 435:   e8 00 00 00 00          callq  43a <get_fixed_ranges+0xaa>
 43a:   44 89 35 00 00 00 00    mov    %r14d,0x0(%rip)        # 441 <get_fixed_ranges+0xb1>

Interestingly enough there are some more hunks of the same pattern in that
function which look all the same. Those are not upsetting objtool. Josh
might give an hint where to stare at.

Just for the fun of it I looked at the GCC output of the same file. It
takes a different apporach:

      asm volatile("1: rdmsr\n"
 c70:   b9 59 02 00 00          mov    $0x259,%ecx
 c75:   0f 32                   rdmsr
      return EAX_EDX_VAL(val, low, high);
 c77:   48 c1 e2 20             shl    $0x20,%rdx
 c7b:   48 89 d3                mov    %rdx,%rbx
 c7e:   48 09 c3                or     %rax,%rbx
 c81:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
 c86:   48 89 1d 00 00 00 00    mov    %rbx,0x0(%rip)        # c8d <get_fixed_ranges.constprop.5+0x7d>

and the tracing code is completely out of line:

      do_trace_read_msr(msr, val, 0);
 ce2:   31 d2                   xor    %edx,%edx
 ce4:   48 89 de                mov    %rbx,%rsi
 ce7:   bf 59 02 00 00          mov    $0x259,%edi
 cec:   e8 00 00 00 00          callq  cf1 <get_fixed_ranges.constprop.5+0xe1>
 cf1:   eb 93                   jmp    c86 <get_fixed_ranges.constprop.5+0x76>

which makes a lot of sense as the normal path (tracepoint disabled) just
runs through linearly while in the clang version it has to jump around the
tracepoint code.

The jump itself is not a problem, but what matters is the $I cache
footprint. The GCC version hotpath fits in 3 cache lines while the Clang
version unconditionally eats 4.2 of them. That's a huge difference.

> Thanks for trying it out and letting us know. Please keep us in the loop
> if you happen to find anything amiss.

Will do.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 18:15               ` Nick Desaulniers
@ 2019-06-25 22:29                 ` Joe Perches
  2019-06-25 22:57                   ` Nick Desaulniers
  2019-06-26  8:49                 ` Peter Zijlstra
  1 sibling, 1 reply; 52+ messages in thread
From: Joe Perches @ 2019-06-25 22:29 UTC (permalink / raw)
  To: Nick Desaulniers, Peter Zijlstra
  Cc: Gustavo A. R. Silva, Miguel Ojeda, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Kees Cook, Shawn Landden,
	Nathan Chancellor, Luc Van Oostenryck, Chandler Carruth

On Tue, 2019-06-25 at 11:15 -0700, Nick Desaulniers wrote:
> Unreleased versions of Clang built from source can; the latest release
> of Clang-8 doesn't have asm goto support required for
> CONFIG_JUMP_LABEL.  Things can get complicated based on which kernel
> tree/branch (mainline, -next, stable), arch, and configs, but I think
> we just have a few long tail bugs left.

At some point, when clang generically compiles the kernel,
I believe it'd be good to remove the various bits that
are unusual like CONFIG_CC_HAS_ASM_GOTO in Makefile
and the scripts/clang-version.sh and the like.

This could help when compiling a specific .config on
different systems.

Maybe add the equivalent compiler-gcc.h #define below
even before the removals

(whatever minor/patchlevel appropriate)
---
 include/linux/compiler-clang.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index 333a6695a918..b46aece0f9ca 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -5,6 +5,14 @@
 
 /* Compiler specific definitions for Clang compiler */
 
+#define CLANG_VERSION (__clang_major__ * 10000	\
+		       + __clang_minor__ * 100	\
+		       + __clang_patchlevel__)
+
+#if CLANG_VERSION < 90000
+# error Sorry, your compiler is too old - please upgrade it.
+#endif
+
 #define uninitialized_var(x) x = *(&(x))
 
 /* same as gcc, this was present in clang-2.6 so we can assume it works





^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 22:29                 ` Joe Perches
@ 2019-06-25 22:57                   ` Nick Desaulniers
  2019-06-25 23:25                     ` Joe Perches
  0 siblings, 1 reply; 52+ messages in thread
From: Nick Desaulniers @ 2019-06-25 22:57 UTC (permalink / raw)
  To: Joe Perches
  Cc: Peter Zijlstra, Gustavo A. R. Silva, Miguel Ojeda, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Kees Cook, Shawn Landden,
	Nathan Chancellor, Luc Van Oostenryck, Chandler Carruth,
	clang-built-linux

On Tue, Jun 25, 2019 at 3:29 PM Joe Perches <joe@perches.com> wrote:
>
> On Tue, 2019-06-25 at 11:15 -0700, Nick Desaulniers wrote:
> > Unreleased versions of Clang built from source can; the latest release
> > of Clang-8 doesn't have asm goto support required for
> > CONFIG_JUMP_LABEL.  Things can get complicated based on which kernel
> > tree/branch (mainline, -next, stable), arch, and configs, but I think
> > we just have a few long tail bugs left.
>
> At some point, when clang generically compiles the kernel,
> I believe it'd be good to remove the various bits that
> are unusual like CONFIG_CC_HAS_ASM_GOTO in Makefile
> and the scripts/clang-version.sh and the like.
>
> This could help when compiling a specific .config on
> different systems.
>
> Maybe add the equivalent compiler-gcc.h #define below
> even before the removals
>
> (whatever minor/patchlevel appropriate)
> ---
>  include/linux/compiler-clang.h | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
> index 333a6695a918..b46aece0f9ca 100644
> --- a/include/linux/compiler-clang.h
> +++ b/include/linux/compiler-clang.h
> @@ -5,6 +5,14 @@
>
>  /* Compiler specific definitions for Clang compiler */
>
> +#define CLANG_VERSION (__clang_major__ * 10000 \
> +                      + __clang_minor__ * 100  \
> +                      + __clang_patchlevel__)
> +
> +#if CLANG_VERSION < 90000
> +# error Sorry, your compiler is too old - please upgrade it.
> +#endif
> +

Heh, I've definitely thought about clang version checks.  And I agree
with your implementation which matches the gcc 4.6 check in
include/linux/compiler-gcc.h.

There are cases when it's appropriate, such as when a certain language
feature has no way of feature detection.  Let's take for example 2
different GNU C extensions.

On one hand, consider __GCC_ASM_FLAG_OUTPUTS__
(https://developers.redhat.com/blog/2016/02/25/new-asm-flags-feature-for-x86-in-gcc-6/).
I like the design of __GCC_ASM_FLAG_OUTPUTS__ because it's
straightforward to do feature detection via preprocessor checks.  This
makes it easy to use the feature when the compiler supports it, or
provide another implementation, regardless of compiler, compiler
version preprocessor defines, etc.  It's compiler agnostic; if the
feature is there then use it or provide a fallback (or error).  I
think the code in include/linux/compiler_attributes.h also exemplifies
this mindset.  (I hope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66970 can be implemented
soon, too, towards this goal of more portable C code).

On the other hand, consider most other GNU C extensions.  How do I
test whether they exist in my compiler or not?  Is it everything or
nothing (do they all have to exist?).  In those cases you either end
up shelling out to something like autoconf (which is what I consider
the current infra around CONFIG_CC_HAS_ASM_GOTO), or code filled with
tons of version checks for specific compilers which are brittle.

Of the two cases, now consider what happens when my compiler that
previously did not support a particular feature now does.  In the
first case, the guards were compiler agnostic, and I *don't have to
change the source* to make use of the feature in the new compiler.  In
the second case, I *need to modify the source* to update the version
checks to be correct.

That's why I consider version checks to be brittle.  Just to hammer
this point away a little more, consider this code in glib for asm goto
detection: https://github.com/GNOME/glib/blob/cbfa776fc149fcc3e351fbdf68c1a299519f4905/glib/gbitlock.c#L182.
This version check literally will not work for clang-9, though it does
support asm goto.  Unfortunately, asm goto doesn't have nice
preprocessor defines like __GCC_ASM_FLAG_OUTPUTS__ does, so someone
literally *needs to edit the source* of glib to make it take advantage
of asm goto in clang-9+ (even though clang-9+ supports the feature in
question).  Feature detection and its benefits over version detection
are well understood in the web development community where devs there
have to worry about implementations from different vendors.

Back to your point about adding a minimal version of Clang to the
kernel; I don't really want to do this.  For non-x86 architectures,
people are happily compiling their kernels with versions of clang as
old as clang-4.  And if it continues to work for them; I'm happy.  And
if it doesn't, and they raise an alarm, we're happy to take a look.
Old LTS distros may have ancient builds of clang, so maybe some kind
of hint would be nice, but I'd also like to support older versions
where we can and I think choosing clang-9 for x86's sake is too
x86-centric.  A version check on CONFIG_JUMP_LABEL is maybe more
appropriate, so it cannot be selected if you're using clang && your
version of clang is not clang-9 or greater?
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 22:57                   ` Nick Desaulniers
@ 2019-06-25 23:25                     ` Joe Perches
  0 siblings, 0 replies; 52+ messages in thread
From: Joe Perches @ 2019-06-25 23:25 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Peter Zijlstra, Gustavo A. R. Silva, Miguel Ojeda, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Kees Cook, Shawn Landden,
	Nathan Chancellor, Luc Van Oostenryck, Chandler Carruth,
	clang-built-linux

On Tue, 2019-06-25 at 15:57 -0700, Nick Desaulniers wrote:
> consider most other GNU C extensions.  How do I
> test whether they exist in my compiler or not?  Is it everything or
> nothing (do they all have to exist?).

Until such time as the linux source code supports alternate
mechanisms for existing gcc extension uses, I think yes.

> In those cases you either end
> up shelling out to something like autoconf (which is what I consider
> the current infra around CONFIG_CC_HAS_ASM_GOTO), or code filled with
> tons of version checks for specific compilers which are brittle.

Or just one...

> Of the two cases, now consider what happens when my compiler that
> previously did not support a particular feature now does.  In the
> first case, the guards were compiler agnostic, and I *don't have to
> change the source* to make use of the feature in the new compiler.  In
> the second case, I *need to modify the source* to update the version
> checks to be correct.
[]
> Back to your point about adding a minimal version of Clang to the
> kernel; I don't really want to do this.  For non-x86 architectures,
> people are happily compiling their kernels with versions of clang as
> old as clang-4.

Perhaps:

#if defined(CONFIG_X86_32) || defined(CONFIG_X86_64)
#define CLANG_MINIMUM_VERSION 90000
#elif defined(CONFIG_ARM) || defined(CONFIG_ARM64)
#define CLANG_MINIMUM_VERSION 40000
#else
etc...
#endif

#if CLANG_VERSION < CLANG_MINIMUM_VERSION
etc...
#endif

> and if it continues to work for them; I'm happy.  And
> if it doesn't, and they raise an alarm, we're happy to take a look.
> Old LTS distros may have ancient builds of clang, so maybe some kind
> of hint would be nice, but I'd also like to support older versions
> where we can and I think choosing clang-9 for x86's sake is too
> x86-centric.  A version check on CONFIG_JUMP_LABEL is maybe more
> appropriate, so it cannot be selected if you're using clang && your
> version of clang is not clang-9 or greater?

The now non-portable nature of .config files might be
improved.



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 20:27                   ` Nathan Chancellor
  2019-06-25 20:37                     ` Nick Desaulniers
  2019-06-25 21:47                     ` Thomas Gleixner
@ 2019-06-25 23:46                     ` Arnaldo Carvalho de Melo
  2019-06-26  5:14                       ` Nathan Chancellor
  2 siblings, 1 reply; 52+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-06-25 23:46 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Thomas Gleixner, Kees Cook, Peter Zijlstra, Miguel Ojeda,
	Gustavo A. R. Silva, Joe Perches, Ingo Molnar,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux

Em Tue, Jun 25, 2019 at 01:27:46PM -0700, Nathan Chancellor escreveu:
> On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:
> > On Tue, 25 Jun 2019, Nathan Chancellor wrote:
> > > On Tue, Jun 25, 2019 at 10:12:42AM -0700, Kees Cook wrote:
> > > > On Tue, Jun 25, 2019 at 09:18:46AM +0200, Peter Zijlstra wrote:
> > > > > Can it build a kernel without patches yet? That is, why should I care
> > > > > what LLVM does?
> > > > 
> > > > Yes. LLVM trunk builds and boots x86 now. As for distro availability,
> > > > AIUI, the asm-goto feature missed the 9.0 LLVM branch point, so it'll
> > > > appear in the following release.
> > > > 
> > > > -- 
> > > > Kees Cook
> > > 
> > > I don't think that's right. LLVM 9 hasn't been branched yet so it should
> > > make it in.
> > > 
> > > http://lists.llvm.org/pipermail/llvm-dev/2019-June/133155.html
> > > 
> > > If anyone wants to play around with it before then, we wrote a
> > > self-contained script that will build an LLVM toolchain suitable for
> > > kernel development:
> > > 
> > > https://github.com/ClangBuiltLinux/tc-build
> > 
> > Useful!
> > 
> > But can the script please check for a minimal clang version required to
> > build that thing.
> > 
> > The default clang-3.8 which is installed on Debian stretch explodes. The
> > 6.0 variant from backports works as advertised.
> > 
> 
> Hmmm interesting, I test a lot of different distros using Docker
> containers to make sure the script works universally and that includes
> Debian stretch, which is the stress tester because all of the packages
> are older.

Interesting, I've been building tools/perf, tools/{lib,arch/include},
etc with lots of clang versions for quite a while, all in containers,
using podman:

----------------------------
The first ones are container based builds of tools/perf with and without libelf
support.  Where clang is available, it is also used to build perf with/without
libelf, and building with LIBCLANGLLVM=1 (built-in clang) with gcc and clang
when clang and its devel libraries are installed.

Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

  $ export PERF_TARBALL=http://192.168.124.1/perf/perf-5.2.0-rc4.tar.xz
  $ dm
   1 alpine:3.4                    : Ok   gcc (Alpine 5.3.0) 5.3.0, clang version 3.8.0 (tags/RELEASE_380/final)
   2 alpine:3.5                    : Ok   gcc (Alpine 6.2.1) 6.2.1 20160822, clang version 3.8.1 (tags/RELEASE_381/final)
   3 alpine:3.6                    : Ok   gcc (Alpine 6.3.0) 6.3.0, clang version 4.0.0 (tags/RELEASE_400/final)
   4 alpine:3.7                    : Ok   gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.0 (tags/RELEASE_500/final) (based on LLVM 5.0.0)
   5 alpine:3.8                    : Ok   gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM 5.0.1)
   6 alpine:3.9                    : Ok   gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 5.0.1 (tags/RELEASE_502/final) (based on LLVM 5.0.1)
   7 alpine:edge                   : Ok   gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 7.0.1 (tags/RELEASE_701/final) (based on LLVM 7.0.1)
   8 amazonlinux:1                 : Ok   gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2), clang version 3.6.2 (tags/RELEASE_362/final)
   9 amazonlinux:2                 : Ok   gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5), clang version 7.0.1 (Amazon Linux 2 7.0.1-1.amzn2.0.2)
  10 android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  11 android-ndk:r15c-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  12 centos:5                      : Ok   gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
  13 centos:6                      : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
  14 centos:7                      : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36)
  15 clearlinux:latest             : Ok   gcc (Clear Linux OS for Intel Architecture) 9.1.1 20190611 gcc-9-branch@272162
  16 debian:8                      : Ok   gcc (Debian 4.9.2-10+deb8u2) 4.9.2, Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
  17 debian:9                      : Ok   gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, clang version 3.8.1-24 (tags/RELEASE_381/final)
  18 debian:experimental           : Ok   gcc (Debian 8.3.0-7) 8.3.0, clang version 7.0.1-8 (tags/RELEASE_701/final)
  19 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 8.3.0-7) 8.3.0
  20 debian:experimental-x-mips    : Ok   mips-linux-gnu-gcc (Debian 8.3.0-7) 8.3.0
  21 debian:experimental-x-mips64  : Ok   mips64-linux-gnuabi64-gcc (Debian 8.3.0-7) 8.3.0
  22 debian:experimental-x-mipsel  : Ok   mipsel-linux-gnu-gcc (Debian 8.3.0-7) 8.3.0
  23 fedora:20                     : Ok   gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7), clang version 3.4.2 (tags/RELEASE_34/dot2-final)
  24 fedora:22                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  25 fedora:23                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.7.0 (tags/RELEASE_370/final)
  26 fedora:24                     : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1), clang version 3.8.1 (tags/RELEASE_381/final)
  27 fedora:24-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
  28 fedora:25                     : Ok   gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1), clang version 3.9.1 (tags/RELEASE_391/final)
  29 fedora:26                     : Ok   gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2), clang version 4.0.1 (tags/RELEASE_401/final)
  30 fedora:27                     : Ok   gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), clang version 5.0.2 (tags/RELEASE_502/final)
  31 fedora:28                     : Ok   gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 6.0.1 (tags/RELEASE_601/final)
  32 fedora:29                     : Ok   gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 7.0.1 (Fedora 7.0.1-6.fc29)
  33 fedora:30                     : Ok   gcc (GCC) 9.1.1 20190503 (Red Hat 9.1.1-1), clang version 8.0.0 (Fedora 8.0.0-1.fc30)
  34 fedora:30-x-ARC-glibc         : Ok   arc-linux-gcc (ARC HS GNU/Linux glibc toolchain 2019.03-rc1) 8.3.1 20190225
  35 fedora:30-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225
  36 fedora:31                     : Ok   gcc (GCC) 9.1.1 20190605 (Red Hat 9.1.1-2), clang version 8.0.0 (Fedora 8.0.0-3.fc31)
  37 fedora:rawhide                : Ok   gcc (GCC) 9.1.1 20190605 (Red Hat 9.1.1-2), clang version 8.0.0 (Fedora 8.0.0-3.fc31)
  38 gentoo-stage3-amd64:latest    : Ok   gcc (Gentoo 8.3.0-r1 p1.1) 8.3.0
  39 mageia:5                      : Ok   gcc (GCC) 4.9.2, clang version 3.5.2 (tags/RELEASE_352/final)
  40 mageia:6                      : Ok   gcc (Mageia 5.5.0-1.mga6) 5.5.0, clang version 3.9.1 (tags/RELEASE_391/final)
  41 mageia:7                      : Ok   gcc (Mageia 8.3.1-0.20190524.1.mga7) 8.3.1 20190524, clang version 8.0.0 (Mageia 8.0.0-1.mga7)
  42 manjaro:latest                : Ok   gcc (GCC) 8.3.0, clang version 8.0.0 (tags/RELEASE_800/final)
   1 openmandriva:cooker           : Ok   gcc (GCC) 9.1.0 20190503 (OpenMandriva)
  43 opensuse:15.0                 : Ok   gcc (SUSE Linux) 7.4.1 20190424 [gcc-7-branch revision 270538], clang version 5.0.1 (tags/RELEASE_501/final 312548)
  44 opensuse:15.1                 : Ok   gcc (SUSE Linux) 7.4.0, clang version 7.0.1 (tags/RELEASE_701/final 349238)
  45 opensuse:42.3                 : Ok   gcc (SUSE Linux) 4.8.5, clang version 3.8.0 (tags/RELEASE_380/final 262553)
  46 opensuse:tumbleweed           : Ok   gcc (SUSE Linux) 9.1.1 20190520 [gcc-9-branch revision 271396], clang version 7.0.1 (tags/RELEASE_701/final 349238)
  47 oraclelinux:6                 : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
  48 oraclelinux:7                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36.0.1), clang version 3.4.2 (tags/RELEASE_34/dot2-final)
  49 ubuntu:12.04                  : Ok   gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
  50 ubuntu:14.04                  : Ok   gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4, Ubuntu clang version 3.4-1ubuntu3 (tags/RELEASE_34/final) (based on LLVM 3.4)
  51 ubuntu:16.04                  : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
  52 ubuntu:16.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  53 ubuntu:16.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  54 ubuntu:16.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  55 ubuntu:16.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  56 ubuntu:16.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  57 ubuntu:16.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  58 ubuntu:18.04                  : Ok   gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0, clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
  59 ubuntu:18.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04) 7.4.0
  60 ubuntu:18.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04) 7.4.0
  61 ubuntu:18.04-x-m68k           : Ok   m68k-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
  62 ubuntu:18.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
  63 ubuntu:18.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
  64 ubuntu:18.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
  65 ubuntu:18.04-x-riscv64        : Ok   riscv64-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
  66 ubuntu:18.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
  67 ubuntu:18.04-x-sh4            : Ok   sh4-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
  68 ubuntu:18.04-x-sparc64        : Ok   sparc64-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
  69 ubuntu:18.10                  : Ok   gcc (Ubuntu 8.3.0-6ubuntu1~18.10.1) 8.3.0, clang version 7.0.0-3 (tags/RELEASE_700/final)
  70 ubuntu:19.04                  : Ok   gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0, clang version 8.0.0-3 (tags/RELEASE_800/final)
  71 ubuntu:19.04-x-alpha          : Ok   alpha-linux-gnu-gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0
  72 ubuntu:19.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 8.3.0-6ubuntu1) 8.3.0
  73 ubuntu:19.04-x-hppa           : Ok   hppa-linux-gnu-gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0
  $

- Arnaldo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 21:47                     ` Thomas Gleixner
@ 2019-06-26  5:10                       ` Nathan Chancellor
  2019-06-26 15:18                         ` Thomas Gleixner
  2019-06-26  9:24                       ` Peter Zijlstra
  2019-06-26 16:30                       ` Peter Zijlstra
  2 siblings, 1 reply; 52+ messages in thread
From: Nathan Chancellor @ 2019-06-26  5:10 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Kees Cook, Peter Zijlstra, Miguel Ojeda, Gustavo A. R. Silva,
	Joe Perches, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf

On Tue, Jun 25, 2019 at 11:47:06PM +0200, Thomas Gleixner wrote:
> On Tue, 25 Jun 2019, Nathan Chancellor wrote:
> > On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:
> > > 
> > > But can the script please check for a minimal clang version required to
> > > build that thing.
> > > 
> > > The default clang-3.8 which is installed on Debian stretch explodes. The
> > > 6.0 variant from backports works as advertised.
> > > 
> > 
> > Hmmm interesting, I test a lot of different distros using Docker
> > containers to make sure the script works universally and that includes
> > Debian stretch, which is the stress tester because all of the packages
> > are older. I install the following packages then run the following
> > command and it works fine for me (just tested):
> > 
> > $ apt update && apt install -y --no-install-recommends ca-certificates \
> > ccache clang cmake curl file gcc g++ git make ninja-build python3 \
> > texinfo zlib1g-dev
> > $ ./build-llvm.py
> > 
> > If you could give me a build log, I'd be happy to look into it and see
> > what I can do.
> 
> I can produce one tomorrow.
>  

Great, thank you!

> > > Kernel builds with the new shiny compiler. Jump labels seem to be enabled.
> > > 
> > > It complains about a few type conversions:
> > > 
> > >  arch/x86/kvm/mmu.c:4596:39: warning: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -205 to 51 [-Wconstant-conversion]
> > >                 u8 wf = (pfec & PFERR_WRITE_MASK) ? ~w : 0;
> > >                    ~~                               ^~
> > > 
> > 
> > Yes, there was a patch sent to try and fix this but it was rejected by
> > the maintainers:
> > 
> > https://github.com/ClangBuiltLinux/linux/issues/95
> > 
> > https://lore.kernel.org/lkml/20180619192504.180479-1-mka@chromium.org/
> 
> Just looked through it. I don't think it's an outright reject. Paolo was
> not totally against it and then the whole discussion degraded into bikeshed
> painting and bitching about compiler error messaged. Try again or should I?
> 

Might be worth having you chime in, given that is the only instance of
that type of warning that I see in my set of builds (I fixed the rest:
https://github.com/ClangBuiltLinux/linux/issues?q=label%3A-Wconstant-conversion)

> > > but it also makes objtool unhappy:
> > > 
> > >  arch/x86/events/intel/core.o: warning: objtool: intel_pmu_nhm_workaround()+0xb3: unreachable
> instruction
> > >  kernel/fork.o: warning: objtool: free_thread_stack()+0x126: unreachable instruction
> > >  mm/workingset.o: warning: objtool: count_shadow_nodes()+0x11f: unreachable instruction
> > >  arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable
> instruction
> > >  arch/x86/kernel/platform-quirks.o: warning: objtool: x86_early_init_platform_quirks()+0x84:
> unreachable instruction
> > >  drivers/iommu/irq_remapping.o: warning: objtool: irq_remap_enable_fault_handling()+0x1d:
> unreachable instruction
> 
> > Unfortunately, we have quite a few of those outstanding, it's probably
> > time to start really taking a look at them:
> > 
> > https://github.com/ClangBuiltLinux/linux/labels/objtool
> 
> I just checked two of them in the disassembly. In both cases it's jump
> label related. Here is one:
> 
>       asm volatile("1: rdmsr\n"
>  410:   b9 59 02 00 00          mov    $0x259,%ecx
>  415:   0f 32                   rdmsr
>  417:   49 89 c6                mov    %rax,%r14
>  41a:   48 89 d3                mov    %rdx,%rbx
>       return EAX_EDX_VAL(val, low, high);
>  41d:   48 c1 e3 20             shl    $0x20,%rbx
>  421:   48 09 c3                or     %rax,%rbx
>  424:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
>  429:   eb 0f                   jmp    43a <get_fixed_ranges+0xaa>
>       do_trace_read_msr(msr, val, 0);
>  42b:   bf 59 02 00 00          mov    $0x259,%edi   <------- "unreachable"
>  430:   48 89 de                mov    %rbx,%rsi
>  433:   31 d2                   xor    %edx,%edx
>  435:   e8 00 00 00 00          callq  43a <get_fixed_ranges+0xaa>
>  43a:   44 89 35 00 00 00 00    mov    %r14d,0x0(%rip)        # 441 <get_fixed_ranges+0xb1>
> 
> Interestingly enough there are some more hunks of the same pattern in that
> function which look all the same. Those are not upsetting objtool. Josh
> might give an hint where to stare at.
> 
> Just for the fun of it I looked at the GCC output of the same file. It
> takes a different apporach:
> 
>       asm volatile("1: rdmsr\n"
>  c70:   b9 59 02 00 00          mov    $0x259,%ecx
>  c75:   0f 32                   rdmsr
>       return EAX_EDX_VAL(val, low, high);
>  c77:   48 c1 e2 20             shl    $0x20,%rdx
>  c7b:   48 89 d3                mov    %rdx,%rbx
>  c7e:   48 09 c3                or     %rax,%rbx
>  c81:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
>  c86:   48 89 1d 00 00 00 00    mov    %rbx,0x0(%rip)        # c8d <get_fixed_ranges.constprop.5+0x7d>
> 
> and the tracing code is completely out of line:
> 
>       do_trace_read_msr(msr, val, 0);
>  ce2:   31 d2                   xor    %edx,%edx
>  ce4:   48 89 de                mov    %rbx,%rsi
>  ce7:   bf 59 02 00 00          mov    $0x259,%edi
>  cec:   e8 00 00 00 00          callq  cf1 <get_fixed_ranges.constprop.5+0xe1>
>  cf1:   eb 93                   jmp    c86 <get_fixed_ranges.constprop.5+0x76>
> 
> which makes a lot of sense as the normal path (tracepoint disabled) just
> runs through linearly while in the clang version it has to jump around the
> tracepoint code.
> 
> The jump itself is not a problem, but what matters is the $I cache
> footprint. The GCC version hotpath fits in 3 cache lines while the Clang
> version unconditionally eats 4.2 of them. That's a huge difference.
> 
> > Thanks for trying it out and letting us know. Please keep us in the loop
> > if you happen to find anything amiss.
> 
> Will do.
> 
> Thanks,
> 
> 	tglx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 23:46                     ` Arnaldo Carvalho de Melo
@ 2019-06-26  5:14                       ` Nathan Chancellor
  0 siblings, 0 replies; 52+ messages in thread
From: Nathan Chancellor @ 2019-06-26  5:14 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Thomas Gleixner, Kees Cook, Peter Zijlstra, Miguel Ojeda,
	Gustavo A. R. Silva, Joe Perches, Ingo Molnar,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux

On Tue, Jun 25, 2019 at 08:46:26PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Jun 25, 2019 at 01:27:46PM -0700, Nathan Chancellor escreveu:
> > On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:
> > > On Tue, 25 Jun 2019, Nathan Chancellor wrote:
> > > > On Tue, Jun 25, 2019 at 10:12:42AM -0700, Kees Cook wrote:
> > > > > On Tue, Jun 25, 2019 at 09:18:46AM +0200, Peter Zijlstra wrote:
> > > > > > Can it build a kernel without patches yet? That is, why should I care
> > > > > > what LLVM does?
> > > > > 
> > > > > Yes. LLVM trunk builds and boots x86 now. As for distro availability,
> > > > > AIUI, the asm-goto feature missed the 9.0 LLVM branch point, so it'll
> > > > > appear in the following release.
> > > > > 
> > > > > -- 
> > > > > Kees Cook
> > > > 
> > > > I don't think that's right. LLVM 9 hasn't been branched yet so it should
> > > > make it in.
> > > > 
> > > > http://lists.llvm.org/pipermail/llvm-dev/2019-June/133155.html
> > > > 
> > > > If anyone wants to play around with it before then, we wrote a
> > > > self-contained script that will build an LLVM toolchain suitable for
> > > > kernel development:
> > > > 
> > > > https://github.com/ClangBuiltLinux/tc-build
> > > 
> > > Useful!
> > > 
> > > But can the script please check for a minimal clang version required to
> > > build that thing.
> > > 
> > > The default clang-3.8 which is installed on Debian stretch explodes. The
> > > 6.0 variant from backports works as advertised.
> > > 
> > 
> > Hmmm interesting, I test a lot of different distros using Docker
> > containers to make sure the script works universally and that includes
> > Debian stretch, which is the stress tester because all of the packages
> > are older.
> 
> Interesting, I've been building tools/perf, tools/{lib,arch/include},
> etc with lots of clang versions for quite a while, all in containers,
> using podman:
> 

Huh, interesting, I'm going to have to check that out (first time I have
heard of podman).

If anyone cares to see it, here is my little crude script:

https://github.com/nathanchance/scripts/blob/d8cfcc05fc50453503d48e32767f24dfac82dcd4/funcs/cbl#L693-L776

Cheers,
Nathan

> ----------------------------
> The first ones are container based builds of tools/perf with and without libelf
> support.  Where clang is available, it is also used to build perf with/without
> libelf, and building with LIBCLANGLLVM=1 (built-in clang) with gcc and clang
> when clang and its devel libraries are installed.
> 
> Several are cross builds, the ones with -x-ARCH and the android one, and those
> may not have all the features built, due to lack of multi-arch devel packages,
> available and being used so far on just a few, like
> debian:experimental-x-{arm64,mipsel}.
> 
>   $ export PERF_TARBALL=http://192.168.124.1/perf/perf-5.2.0-rc4.tar.xz
>   $ dm
>    1 alpine:3.4                    : Ok   gcc (Alpine 5.3.0) 5.3.0, clang version 3.8.0 (tags/RELEASE_380/final)
>    2 alpine:3.5                    : Ok   gcc (Alpine 6.2.1) 6.2.1 20160822, clang version 3.8.1 (tags/RELEASE_381/final)
>    3 alpine:3.6                    : Ok   gcc (Alpine 6.3.0) 6.3.0, clang version 4.0.0 (tags/RELEASE_400/final)
>    4 alpine:3.7                    : Ok   gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.0 (tags/RELEASE_500/final) (based on LLVM 5.0.0)
>    5 alpine:3.8                    : Ok   gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM 5.0.1)
>    6 alpine:3.9                    : Ok   gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 5.0.1 (tags/RELEASE_502/final) (based on LLVM 5.0.1)
>    7 alpine:edge                   : Ok   gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 7.0.1 (tags/RELEASE_701/final) (based on LLVM 7.0.1)
>    8 amazonlinux:1                 : Ok   gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2), clang version 3.6.2 (tags/RELEASE_362/final)
>    9 amazonlinux:2                 : Ok   gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5), clang version 7.0.1 (Amazon Linux 2 7.0.1-1.amzn2.0.2)
>   10 android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
>   11 android-ndk:r15c-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
>   12 centos:5                      : Ok   gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
>   13 centos:6                      : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
>   14 centos:7                      : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36)
>   15 clearlinux:latest             : Ok   gcc (Clear Linux OS for Intel Architecture) 9.1.1 20190611 gcc-9-branch@272162
>   16 debian:8                      : Ok   gcc (Debian 4.9.2-10+deb8u2) 4.9.2, Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
>   17 debian:9                      : Ok   gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, clang version 3.8.1-24 (tags/RELEASE_381/final)
>   18 debian:experimental           : Ok   gcc (Debian 8.3.0-7) 8.3.0, clang version 7.0.1-8 (tags/RELEASE_701/final)
>   19 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 8.3.0-7) 8.3.0
>   20 debian:experimental-x-mips    : Ok   mips-linux-gnu-gcc (Debian 8.3.0-7) 8.3.0
>   21 debian:experimental-x-mips64  : Ok   mips64-linux-gnuabi64-gcc (Debian 8.3.0-7) 8.3.0
>   22 debian:experimental-x-mipsel  : Ok   mipsel-linux-gnu-gcc (Debian 8.3.0-7) 8.3.0
>   23 fedora:20                     : Ok   gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7), clang version 3.4.2 (tags/RELEASE_34/dot2-final)
>   24 fedora:22                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
>   25 fedora:23                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.7.0 (tags/RELEASE_370/final)
>   26 fedora:24                     : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1), clang version 3.8.1 (tags/RELEASE_381/final)
>   27 fedora:24-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
>   28 fedora:25                     : Ok   gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1), clang version 3.9.1 (tags/RELEASE_391/final)
>   29 fedora:26                     : Ok   gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2), clang version 4.0.1 (tags/RELEASE_401/final)
>   30 fedora:27                     : Ok   gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), clang version 5.0.2 (tags/RELEASE_502/final)
>   31 fedora:28                     : Ok   gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 6.0.1 (tags/RELEASE_601/final)
>   32 fedora:29                     : Ok   gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 7.0.1 (Fedora 7.0.1-6.fc29)
>   33 fedora:30                     : Ok   gcc (GCC) 9.1.1 20190503 (Red Hat 9.1.1-1), clang version 8.0.0 (Fedora 8.0.0-1.fc30)
>   34 fedora:30-x-ARC-glibc         : Ok   arc-linux-gcc (ARC HS GNU/Linux glibc toolchain 2019.03-rc1) 8.3.1 20190225
>   35 fedora:30-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225
>   36 fedora:31                     : Ok   gcc (GCC) 9.1.1 20190605 (Red Hat 9.1.1-2), clang version 8.0.0 (Fedora 8.0.0-3.fc31)
>   37 fedora:rawhide                : Ok   gcc (GCC) 9.1.1 20190605 (Red Hat 9.1.1-2), clang version 8.0.0 (Fedora 8.0.0-3.fc31)
>   38 gentoo-stage3-amd64:latest    : Ok   gcc (Gentoo 8.3.0-r1 p1.1) 8.3.0
>   39 mageia:5                      : Ok   gcc (GCC) 4.9.2, clang version 3.5.2 (tags/RELEASE_352/final)
>   40 mageia:6                      : Ok   gcc (Mageia 5.5.0-1.mga6) 5.5.0, clang version 3.9.1 (tags/RELEASE_391/final)
>   41 mageia:7                      : Ok   gcc (Mageia 8.3.1-0.20190524.1.mga7) 8.3.1 20190524, clang version 8.0.0 (Mageia 8.0.0-1.mga7)
>   42 manjaro:latest                : Ok   gcc (GCC) 8.3.0, clang version 8.0.0 (tags/RELEASE_800/final)
>    1 openmandriva:cooker           : Ok   gcc (GCC) 9.1.0 20190503 (OpenMandriva)
>   43 opensuse:15.0                 : Ok   gcc (SUSE Linux) 7.4.1 20190424 [gcc-7-branch revision 270538], clang version 5.0.1 (tags/RELEASE_501/final 312548)
>   44 opensuse:15.1                 : Ok   gcc (SUSE Linux) 7.4.0, clang version 7.0.1 (tags/RELEASE_701/final 349238)
>   45 opensuse:42.3                 : Ok   gcc (SUSE Linux) 4.8.5, clang version 3.8.0 (tags/RELEASE_380/final 262553)
>   46 opensuse:tumbleweed           : Ok   gcc (SUSE Linux) 9.1.1 20190520 [gcc-9-branch revision 271396], clang version 7.0.1 (tags/RELEASE_701/final 349238)
>   47 oraclelinux:6                 : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
>   48 oraclelinux:7                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36.0.1), clang version 3.4.2 (tags/RELEASE_34/dot2-final)
>   49 ubuntu:12.04                  : Ok   gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
>   50 ubuntu:14.04                  : Ok   gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4, Ubuntu clang version 3.4-1ubuntu3 (tags/RELEASE_34/final) (based on LLVM 3.4)
>   51 ubuntu:16.04                  : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
>   52 ubuntu:16.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
>   53 ubuntu:16.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
>   54 ubuntu:16.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
>   55 ubuntu:16.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
>   56 ubuntu:16.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
>   57 ubuntu:16.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
>   58 ubuntu:18.04                  : Ok   gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0, clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
>   59 ubuntu:18.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04) 7.4.0
>   60 ubuntu:18.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04) 7.4.0
>   61 ubuntu:18.04-x-m68k           : Ok   m68k-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
>   62 ubuntu:18.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
>   63 ubuntu:18.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
>   64 ubuntu:18.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
>   65 ubuntu:18.04-x-riscv64        : Ok   riscv64-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
>   66 ubuntu:18.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
>   67 ubuntu:18.04-x-sh4            : Ok   sh4-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
>   68 ubuntu:18.04-x-sparc64        : Ok   sparc64-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0
>   69 ubuntu:18.10                  : Ok   gcc (Ubuntu 8.3.0-6ubuntu1~18.10.1) 8.3.0, clang version 7.0.0-3 (tags/RELEASE_700/final)
>   70 ubuntu:19.04                  : Ok   gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0, clang version 8.0.0-3 (tags/RELEASE_800/final)
>   71 ubuntu:19.04-x-alpha          : Ok   alpha-linux-gnu-gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0
>   72 ubuntu:19.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 8.3.0-6ubuntu1) 8.3.0
>   73 ubuntu:19.04-x-hppa           : Ok   hppa-linux-gnu-gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0
>   $
> 
> - Arnaldo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 17:12             ` Kees Cook
  2019-06-25 18:05               ` Nathan Chancellor
@ 2019-06-26  8:06               ` Peter Zijlstra
  1 sibling, 0 replies; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-26  8:06 UTC (permalink / raw)
  To: Kees Cook
  Cc: Miguel Ojeda, Gustavo A. R. Silva, Joe Perches, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden

On Tue, Jun 25, 2019 at 10:12:42AM -0700, Kees Cook wrote:
> On Tue, Jun 25, 2019 at 09:18:46AM +0200, Peter Zijlstra wrote:
> > Can it build a kernel without patches yet? That is, why should I care
> > what LLVM does?
> 
> Yes. LLVM trunk builds and boots x86 now. As for distro availability,

If it's not release and not in distros, then no. So then get that same
trunk to support the __fallthrough crap so it all lands in the same
release and we done ;-)

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 18:15               ` Nick Desaulniers
  2019-06-25 22:29                 ` Joe Perches
@ 2019-06-26  8:49                 ` Peter Zijlstra
  2019-06-26 22:14                   ` Nick Desaulniers
  1 sibling, 1 reply; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-26  8:49 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Gustavo A. R. Silva, Joe Perches, Miguel Ojeda, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Kees Cook, Shawn Landden,
	Nathan Chancellor, Luc Van Oostenryck, Chandler Carruth

On Tue, Jun 25, 2019 at 11:15:57AM -0700, Nick Desaulniers wrote:

> Unreleased versions of Clang built from source can;

I've bad experiences with using unreleased compilers; life is too short.

> We're currently planning multiple output constraint support w/ asm
> goto, and have recently implemented things like
> __GCC_ASM_FLAG_OUTPUTS__.

That's good to hear.

> If there's other features that we should
> start implementing, please let us know.

If you've got any ideas on how to make this:

  https://lkml.kernel.org/r/20190621120923.GT3463@hirez.programming.kicks-ass.net

work, that'd be nice. Basically I wanted the asm goto to emit a 2 or 5
byte JMP/NOP depending on the displacement size. We can trivially get
JMP right by using:

	jmp \l_yes

and letting the assembler sort it, but getting the NOP right has so far
eluded me:

.if \l_yes - (. + 2) < 127
	.byte 0x66, 0x90
.else
	.byte STATIC_KEY_INIT_NOP
.endif

doesn't work. We can ofcourse unconditionally emit the JMP and then
rewrite the binary afterward, and replace the emitted jumps with the
right size NOP, but that's a bit yuck.

Once it emits the variable size instruction consistently, we can update
the patching side to use the same condition to select the new
instruction (and fix objtool).

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 21:47                     ` Thomas Gleixner
  2019-06-26  5:10                       ` Nathan Chancellor
@ 2019-06-26  9:24                       ` Peter Zijlstra
  2019-06-26  9:55                         ` Peter Zijlstra
                                           ` (2 more replies)
  2019-06-26 16:30                       ` Peter Zijlstra
  2 siblings, 3 replies; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-26  9:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Nathan Chancellor, Kees Cook, Miguel Ojeda, Gustavo A. R. Silva,
	Joe Perches, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf

On Tue, Jun 25, 2019 at 11:47:06PM +0200, Thomas Gleixner wrote:
> > On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:

> > > but it also makes objtool unhappy:
> > > 
> > >  arch/x86/events/intel/core.o: warning: objtool: intel_pmu_nhm_workaround()+0xb3: unreachable instruction
> > >  kernel/fork.o: warning: objtool: free_thread_stack()+0x126: unreachable instruction
> > >  mm/workingset.o: warning: objtool: count_shadow_nodes()+0x11f: unreachable instruction
> > >  arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable instruction
> > >  arch/x86/kernel/platform-quirks.o: warning: objtool: x86_early_init_platform_quirks()+0x84: unreachable instruction
> > >  drivers/iommu/irq_remapping.o: warning: objtool: irq_remap_enable_fault_handling()+0x1d: unreachable instruction

> I just checked two of them in the disassembly. In both cases it's jump
> label related. Here is one:
> 
>       asm volatile("1: rdmsr\n"
>  410:   b9 59 02 00 00          mov    $0x259,%ecx
>  415:   0f 32                   rdmsr
>  417:   49 89 c6                mov    %rax,%r14
>  41a:   48 89 d3                mov    %rdx,%rbx
>       return EAX_EDX_VAL(val, low, high);
>  41d:   48 c1 e3 20             shl    $0x20,%rbx
>  421:   48 09 c3                or     %rax,%rbx
>  424:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
>  429:   eb 0f                   jmp    43a <get_fixed_ranges+0xaa>
>       do_trace_read_msr(msr, val, 0);
>  42b:   bf 59 02 00 00          mov    $0x259,%edi   <------- "unreachable"
>  430:   48 89 de                mov    %rbx,%rsi
>  433:   31 d2                   xor    %edx,%edx
>  435:   e8 00 00 00 00          callq  43a <get_fixed_ranges+0xaa>
>  43a:   44 89 35 00 00 00 00    mov    %r14d,0x0(%rip)        # 441 <get_fixed_ranges+0xb1>
> 
> Interestingly enough there are some more hunks of the same pattern in that
> function which look all the same. Those are not upsetting objtool. Josh
> might give an hint where to stare at.

That's pretty atrocious code-gen :/ Does LLVM support things like label
attributes? Back when we did jump labels GCC didn't, or rather, it
ignored it completely when combined with asm goto (and it might still).

That is, would something like this:

diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
index 06c3cc22a058..1761b1e76ddc 100644
--- a/arch/x86/include/asm/jump_label.h
+++ b/arch/x86/include/asm/jump_label.h
@@ -32,7 +32,7 @@ static __always_inline bool arch_static_branch(struct static_key *key, bool bran
 		: :  "i" (key), "i" (branch) : : l_yes);
 
 	return false;
-l_yes:
+l_yes: __attribute__((cold));
 	return true;
 }
 
@@ -49,7 +49,7 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool
 		: :  "i" (key), "i" (branch) : : l_yes);
 
 	return false;
-l_yes:
+l_yes: __attribute__((hot));
 	return true;
 }
 
Help LLVM?

Still, objtool should be able to deal with that code.

> Just for the fun of it I looked at the GCC output of the same file. It
> takes a different apporach:
> 
>       asm volatile("1: rdmsr\n"
>  c70:   b9 59 02 00 00          mov    $0x259,%ecx
>  c75:   0f 32                   rdmsr
>       return EAX_EDX_VAL(val, low, high);
>  c77:   48 c1 e2 20             shl    $0x20,%rdx
>  c7b:   48 89 d3                mov    %rdx,%rbx
>  c7e:   48 09 c3                or     %rax,%rbx
>  c81:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
>  c86:   48 89 1d 00 00 00 00    mov    %rbx,0x0(%rip)        # c8d <get_fixed_ranges.constprop.5+0x7d>
> 
> and the tracing code is completely out of line:
> 
>       do_trace_read_msr(msr, val, 0);
>  ce2:   31 d2                   xor    %edx,%edx
>  ce4:   48 89 de                mov    %rbx,%rsi
>  ce7:   bf 59 02 00 00          mov    $0x259,%edi
>  cec:   e8 00 00 00 00          callq  cf1 <get_fixed_ranges.constprop.5+0xe1>
>  cf1:   eb 93                   jmp    c86 <get_fixed_ranges.constprop.5+0x76>
> 
> which makes a lot of sense as the normal path (tracepoint disabled) just
> runs through linearly while in the clang version it has to jump around the
> tracepoint code.
> 
> The jump itself is not a problem, but what matters is the $I cache
> footprint. The GCC version hotpath fits in 3 cache lines while the Clang
> version unconditionally eats 4.2 of them. That's a huge difference.

Yeah, this is the right and expected code-gen.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-26  9:24                       ` Peter Zijlstra
@ 2019-06-26  9:55                         ` Peter Zijlstra
  2019-06-26 22:23                           ` Nick Desaulniers
  2019-06-26 10:43                         ` Peter Zijlstra
  2019-06-26 22:15                         ` Nick Desaulniers
  2 siblings, 1 reply; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-26  9:55 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Nathan Chancellor, Kees Cook, Miguel Ojeda, Gustavo A. R. Silva,
	Joe Perches, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf

On Wed, Jun 26, 2019 at 11:24:32AM +0200, Peter Zijlstra wrote:
> On Tue, Jun 25, 2019 at 11:47:06PM +0200, Thomas Gleixner wrote:
> > > On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:
> 
> > > > but it also makes objtool unhappy:
> > > > 
> > > >  arch/x86/events/intel/core.o: warning: objtool: intel_pmu_nhm_workaround()+0xb3: unreachable instruction
> > > >  kernel/fork.o: warning: objtool: free_thread_stack()+0x126: unreachable instruction
> > > >  mm/workingset.o: warning: objtool: count_shadow_nodes()+0x11f: unreachable instruction
> > > >  arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable instruction
> > > >  arch/x86/kernel/platform-quirks.o: warning: objtool: x86_early_init_platform_quirks()+0x84: unreachable instruction
> > > >  drivers/iommu/irq_remapping.o: warning: objtool: irq_remap_enable_fault_handling()+0x1d: unreachable instruction
> 
> > I just checked two of them in the disassembly. In both cases it's jump
> > label related. Here is one:
> > 
> >       asm volatile("1: rdmsr\n"
> >  410:   b9 59 02 00 00          mov    $0x259,%ecx
> >  415:   0f 32                   rdmsr
> >  417:   49 89 c6                mov    %rax,%r14
> >  41a:   48 89 d3                mov    %rdx,%rbx
> >       return EAX_EDX_VAL(val, low, high);
> >  41d:   48 c1 e3 20             shl    $0x20,%rbx
> >  421:   48 09 c3                or     %rax,%rbx
> >  424:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> >  429:   eb 0f                   jmp    43a <get_fixed_ranges+0xaa>
> >       do_trace_read_msr(msr, val, 0);
> >  42b:   bf 59 02 00 00          mov    $0x259,%edi   <------- "unreachable"
> >  430:   48 89 de                mov    %rbx,%rsi
> >  433:   31 d2                   xor    %edx,%edx
> >  435:   e8 00 00 00 00          callq  43a <get_fixed_ranges+0xaa>
> >  43a:   44 89 35 00 00 00 00    mov    %r14d,0x0(%rip)        # 441 <get_fixed_ranges+0xb1>
> > 
> > Interestingly enough there are some more hunks of the same pattern in that
> > function which look all the same. Those are not upsetting objtool. Josh
> > might give an hint where to stare at.
> 
> That's pretty atrocious code-gen :/ 

And I know nobody reads comments (I don't either), but I did write one
on this as it happens.

See include/linux/jump_label.h:399

/*
 * Combine the right initial value (type) with the right branch order
 * to generate the desired result.
 *
 *
 * type\branch|	likely (1)	      |	unlikely (0)
 * -----------+-----------------------+------------------
 *            |                       |
 *  true (1)  |	   ...		      |	   ...
 *            |    NOP		      |	   JMP L
 *            |    <br-stmts>	      |	1: ...
 *            |	L: ...		      |
 *            |			      |
 *            |			      |	L: <br-stmts>
 *            |			      |	   jmp 1b
 *            |                       |
 * -----------+-----------------------+------------------
 *            |                       |
 *  false (0) |	   ...		      |	   ...
 *            |    JMP L	      |	   NOP
 *            |    <br-stmts>	      |	1: ...
 *            |	L: ...		      |
 *            |			      |
 *            |			      |	L: <br-stmts>
 *            |			      |	   jmp 1b
 *            |                       |
 * -----------+-----------------------+------------------
 *
 * The initial value is encoded in the LSB of static_key::entries,
 * type: 0 = false, 1 = true.
 *
 * The branch type is encoded in the LSB of jump_entry::key,
 * branch: 0 = unlikely, 1 = likely.
 *
 * This gives the following logic table:
 *
 *	enabled	type	branch	  instuction
 * -----------------------------+-----------
 *	0	0	0	| NOP
 *	0	0	1	| JMP
 *	0	1	0	| NOP
 *	0	1	1	| JMP
 *
 *	1	0	0	| JMP
 *	1	0	1	| NOP
 *	1	1	0	| JMP
 *	1	1	1	| NOP
 *
 * Which gives the following functions:
 *
 *   dynamic: instruction = enabled ^ branch
 *   static:  instruction = type ^ branch
 *
 * See jump_label_type() / jump_label_init_type().
 */

The case here is false-unlikely, which *should* then result in the
branch statements going out-of-line.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-26  9:24                       ` Peter Zijlstra
  2019-06-26  9:55                         ` Peter Zijlstra
@ 2019-06-26 10:43                         ` Peter Zijlstra
  2019-06-26 22:15                         ` Nick Desaulniers
  2 siblings, 0 replies; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-26 10:43 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Nathan Chancellor, Kees Cook, Miguel Ojeda, Gustavo A. R. Silva,
	Joe Perches, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf

On Wed, Jun 26, 2019 at 11:24:32AM +0200, Peter Zijlstra wrote:
> That is, would something like this:
> 
> diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
> index 06c3cc22a058..1761b1e76ddc 100644
> --- a/arch/x86/include/asm/jump_label.h
> +++ b/arch/x86/include/asm/jump_label.h
> @@ -32,7 +32,7 @@ static __always_inline bool arch_static_branch(struct static_key *key, bool bran
>  		: :  "i" (key), "i" (branch) : : l_yes);
>  
>  	return false;
> -l_yes:
> +l_yes: __attribute__((cold));
>  	return true;
>  }
>  
> @@ -49,7 +49,7 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool
>  		: :  "i" (key), "i" (branch) : : l_yes);
>  
>  	return false;
> -l_yes:
> +l_yes: __attribute__((hot));
>  	return true;
>  }
>  
> Help LLVM?

No, that's broken. What we do is the likely() and unlikely() hints in
static_branch_{,un}likely():

#define static_branch_likely(x)							\
({										\
	bool branch;								\
	if (__builtin_types_compatible_p(typeof(*x), struct static_key_true))	\
		branch = !arch_static_branch(&(x)->key, true);			\
	else if (__builtin_types_compatible_p(typeof(*x), struct static_key_false)) \
		branch = !arch_static_branch_jump(&(x)->key, true);		\
	else									\
		branch = ____wrong_branch_error();				\
	likely(branch);								\
})

#define static_branch_unlikely(x)						\
({										\
	bool branch;								\
	if (__builtin_types_compatible_p(typeof(*x), struct static_key_true))	\
		branch = arch_static_branch_jump(&(x)->key, false);		\
	else if (__builtin_types_compatible_p(typeof(*x), struct static_key_false)) \
		branch = arch_static_branch(&(x)->key, false);			\
	else									\
		branch = ____wrong_branch_error();				\
	unlikely(branch);							\
})

That should convey out-of-line vs in-line 'hint'. And clearly that's
broken for LLVM.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-26  5:10                       ` Nathan Chancellor
@ 2019-06-26 15:18                         ` Thomas Gleixner
  2019-06-26 19:00                           ` Nathan Chancellor
  0 siblings, 1 reply; 52+ messages in thread
From: Thomas Gleixner @ 2019-06-26 15:18 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Kees Cook, Peter Zijlstra, Miguel Ojeda, Gustavo A. R. Silva,
	Joe Perches, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf

On Tue, 25 Jun 2019, Nathan Chancellor wrote:
> On Tue, Jun 25, 2019 at 11:47:06PM +0200, Thomas Gleixner wrote:
> > On Tue, 25 Jun 2019, Nathan Chancellor wrote:
> > > On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:
> > > > 
> > > > But can the script please check for a minimal clang version required to
> > > > build that thing.
> > > > 
> > > > The default clang-3.8 which is installed on Debian stretch explodes. The
> > > > 6.0 variant from backports works as advertised.
> > > > 
> > > 
> > > Hmmm interesting, I test a lot of different distros using Docker
> > > containers to make sure the script works universally and that includes
> > > Debian stretch, which is the stress tester because all of the packages
> > > are older. I install the following packages then run the following
> > > command and it works fine for me (just tested):
> > > 
> > > $ apt update && apt install -y --no-install-recommends ca-certificates \
> > > ccache clang cmake curl file gcc g++ git make ninja-build python3 \
> > > texinfo zlib1g-dev
> > > $ ./build-llvm.py
> > > 
> > > If you could give me a build log, I'd be happy to look into it and see
> > > what I can do.
> > 
> > I can produce one tomorrow.

tarball with log and the preprocessed source and run scripts:

    https://tglx.de/~tglx/tc-crash.tar.bz2

The machine runs up to date debian stretch which has backports enabled and
I just used the install command from the github project page you linked
to. Getting started section.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-25 21:47                     ` Thomas Gleixner
  2019-06-26  5:10                       ` Nathan Chancellor
  2019-06-26  9:24                       ` Peter Zijlstra
@ 2019-06-26 16:30                       ` Peter Zijlstra
  2019-06-26 22:33                         ` Nick Desaulniers
  2 siblings, 1 reply; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-26 16:30 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Nathan Chancellor, Kees Cook, Miguel Ojeda, Gustavo A. R. Silva,
	Joe Perches, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf

On Tue, Jun 25, 2019 at 11:47:06PM +0200, Thomas Gleixner wrote:
> > On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:

> > > but it also makes objtool unhappy:

> > >  arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable instruction

> I just checked two of them in the disassembly. In both cases it's jump
> label related. Here is one:
> 
>       asm volatile("1: rdmsr\n"
>  410:   b9 59 02 00 00          mov    $0x259,%ecx
>  415:   0f 32                   rdmsr
>  417:   49 89 c6                mov    %rax,%r14
>  41a:   48 89 d3                mov    %rdx,%rbx
>       return EAX_EDX_VAL(val, low, high);
>  41d:   48 c1 e3 20             shl    $0x20,%rbx
>  421:   48 09 c3                or     %rax,%rbx
>  424:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
>  429:   eb 0f                   jmp    43a <get_fixed_ranges+0xaa>
>       do_trace_read_msr(msr, val, 0);
>  42b:   bf 59 02 00 00          mov    $0x259,%edi   <------- "unreachable"
>  430:   48 89 de                mov    %rbx,%rsi
>  433:   31 d2                   xor    %edx,%edx
>  435:   e8 00 00 00 00          callq  43a <get_fixed_ranges+0xaa>
>  43a:   44 89 35 00 00 00 00    mov    %r14d,0x0(%rip)        # 441 <get_fixed_ranges+0xb1>

Thomas provided the actual .o file, and from that we find that the
.rela__jump_table entries look like:

000000000010  000100000002 R_X86_64_PC32     0000000000000000 .text + 3e9
000000000014  000100000002 R_X86_64_PC32     0000000000000000 .text + 3f0
000000000018  006100000018 R_X86_64_PC64     0000000000000000 __tracepoint_read_msr + 8
000000000020  000100000002 R_X86_64_PC32     0000000000000000 .text + 424
000000000024  000100000002 R_X86_64_PC32     0000000000000000 .text + 3f0
000000000028  006100000018 R_X86_64_PC64     0000000000000000 __tracepoint_read_msr + 8

From this we find that the jump target that goes with the NOP at +424 is
+3f0, not +42b as one would expect.

And as Josh noted, it is also 'weird' that this +3f0 is the very same as
the target for the previous entry.

When we compare the code at both sites, we find:

3f0:       bf 58 02 00 00          mov    $0x258,%edi
3f5:       48 89 de                mov    %rbx,%rsi
3f8:       31 d2                   xor    %edx,%edx
3fa:       e8 00 00 00 00          callq  3ff <get_fixed_ranges+0x6f>
		3fb: R_X86_64_PC32      do_trace_read_msr-0x4

vs

42b:       bf 59 02 00 00          mov    $0x259,%edi
430:       48 89 de                mov    %rbx,%rsi
433:       31 d2                   xor    %edx,%edx
435:       e8 00 00 00 00          callq  43a <get_fixed_ranges+0xaa>
		436: R_X86_64_PC32      do_trace_read_msr-0x4

Which is not in fact the same code.

So for some reason the .rela__jump_table are buggy on this clang build.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-26 15:18                         ` Thomas Gleixner
@ 2019-06-26 19:00                           ` Nathan Chancellor
  2019-06-26 19:46                             ` Thomas Gleixner
  0 siblings, 1 reply; 52+ messages in thread
From: Nathan Chancellor @ 2019-06-26 19:00 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Kees Cook, Peter Zijlstra, Miguel Ojeda, Gustavo A. R. Silva,
	Joe Perches, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf

On Wed, Jun 26, 2019 at 05:18:37PM +0200, Thomas Gleixner wrote:
> tarball with log and the preprocessed source and run scripts:
> 
>     https://tglx.de/~tglx/tc-crash.tar.bz2
> 
> The machine runs up to date debian stretch which has backports enabled and
> I just used the install command from the github project page you linked
> to. Getting started section.
> 
> Thanks,
> 
> 	tglx

Great, thank you! It explodes during lowering, which is a backend issue
so that's fun :/

My guess is that this is a problem with -march=native on that version of
LLVM (since a newer one works). Could you try this patch that makes that
opt-in and see if that fixes it?

https://github.com/nathanchance/tc-build/commit/9f1ae41cd4246f9e4d011542f094aa0df2c069b4

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-26 19:00                           ` Nathan Chancellor
@ 2019-06-26 19:46                             ` Thomas Gleixner
  2019-06-26 20:03                               ` Nathan Chancellor
  0 siblings, 1 reply; 52+ messages in thread
From: Thomas Gleixner @ 2019-06-26 19:46 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Kees Cook, Peter Zijlstra, Miguel Ojeda, Gustavo A. R. Silva,
	Joe Perches, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf

On Wed, 26 Jun 2019, Nathan Chancellor wrote:
> On Wed, Jun 26, 2019 at 05:18:37PM +0200, Thomas Gleixner wrote:
> > tarball with log and the preprocessed source and run scripts:
> > 
> >     https://tglx.de/~tglx/tc-crash.tar.bz2
> > 
> > The machine runs up to date debian stretch which has backports enabled and
> > I just used the install command from the github project page you linked
> > to. Getting started section.
> > 
> Great, thank you! It explodes during lowering, which is a backend issue
> so that's fun :/
> 
> My guess is that this is a problem with -march=native on that version of
> LLVM (since a newer one works). Could you try this patch that makes that
> opt-in and see if that fixes it?
> 
> https://github.com/nathanchance/tc-build/commit/9f1ae41cd4246f9e4d011542f094aa0df2c069b4

clang --version
clang version 3.8.1-24 (tags/RELEASE_381/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

./build-llvm.py
...
LLVM build duration: 0:03:14

Tested-by: Thomas Gleixner <tglx@linutronix.de>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-26 19:46                             ` Thomas Gleixner
@ 2019-06-26 20:03                               ` Nathan Chancellor
  0 siblings, 0 replies; 52+ messages in thread
From: Nathan Chancellor @ 2019-06-26 20:03 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Kees Cook, Peter Zijlstra, Miguel Ojeda, Gustavo A. R. Silva,
	Joe Perches, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Borislav Petkov,
	H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf

On Wed, Jun 26, 2019 at 09:46:02PM +0200, Thomas Gleixner wrote:
> On Wed, 26 Jun 2019, Nathan Chancellor wrote:
> > On Wed, Jun 26, 2019 at 05:18:37PM +0200, Thomas Gleixner wrote:
> > > tarball with log and the preprocessed source and run scripts:
> > > 
> > >     https://tglx.de/~tglx/tc-crash.tar.bz2
> > > 
> > > The machine runs up to date debian stretch which has backports enabled and
> > > I just used the install command from the github project page you linked
> > > to. Getting started section.
> > > 
> > Great, thank you! It explodes during lowering, which is a backend issue
> > so that's fun :/
> > 
> > My guess is that this is a problem with -march=native on that version of
> > LLVM (since a newer one works). Could you try this patch that makes that
> > opt-in and see if that fixes it?
> > 
> > https://github.com/nathanchance/tc-build/commit/9f1ae41cd4246f9e4d011542f094aa0df2c069b4
> 
> clang --version
> clang version 3.8.1-24 (tags/RELEASE_381/final)
> Target: x86_64-pc-linux-gnu
> Thread model: posix
> InstalledDir: /usr/bin
> 
> ./build-llvm.py
> ...
> LLVM build duration: 0:03:14
> 
> Tested-by: Thomas Gleixner <tglx@linutronix.de>

Fantastic, thank you!

https://github.com/ClangBuiltLinux/tc-build/pull/19

Should be merged soon hopefully.

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-26  8:49                 ` Peter Zijlstra
@ 2019-06-26 22:14                   ` Nick Desaulniers
  2019-06-27  7:12                     ` Peter Zijlstra
  0 siblings, 1 reply; 52+ messages in thread
From: Nick Desaulniers @ 2019-06-26 22:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Gustavo A. R. Silva, Joe Perches, Miguel Ojeda, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Kees Cook, Shawn Landden,
	Nathan Chancellor, Luc Van Oostenryck, Chandler Carruth

On Wed, Jun 26, 2019 at 1:49 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Jun 25, 2019 at 11:15:57AM -0700, Nick Desaulniers wrote:
>
> > Unreleased versions of Clang built from source can;
>
> I've bad experiences with using unreleased compilers; life is too short.

Yes; but before release is when they need the help the most in order
for testing to find regressions.

>
> > We're currently planning multiple output constraint support w/ asm
> > goto, and have recently implemented things like
> > __GCC_ASM_FLAG_OUTPUTS__.
>
> That's good to hear.
>
> > If there's other features that we should
> > start implementing, please let us know.
>
> If you've got any ideas on how to make this:
>
>   https://lkml.kernel.org/r/20190621120923.GT3463@hirez.programming.kicks-ass.net
>
> work, that'd be nice. Basically I wanted the asm goto to emit a 2 or 5
> byte JMP/NOP depending on the displacement size. We can trivially get
> JMP right by using:
>
>         jmp \l_yes
>
> and letting the assembler sort it, but getting the NOP right has so far
> eluded me:
>
> .if \l_yes - (. + 2) < 127
>         .byte 0x66, 0x90
> .else
>         .byte STATIC_KEY_INIT_NOP
> .endif
>
> doesn't work. We can ofcourse unconditionally emit the JMP and then
> rewrite the binary afterward, and replace the emitted jumps with the
> right size NOP, but that's a bit yuck.
>
> Once it emits the variable size instruction consistently, we can update
> the patching side to use the same condition to select the new
> instruction (and fix objtool).

Not sure; the assembler directives and their requirements aren't
something I'm too familiar with.

-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-26  9:24                       ` Peter Zijlstra
  2019-06-26  9:55                         ` Peter Zijlstra
  2019-06-26 10:43                         ` Peter Zijlstra
@ 2019-06-26 22:15                         ` Nick Desaulniers
  2019-06-27  7:16                           ` Peter Zijlstra
  2 siblings, 1 reply; 52+ messages in thread
From: Nick Desaulniers @ 2019-06-26 22:15 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Nathan Chancellor, Kees Cook, Miguel Ojeda,
	Gustavo A. R. Silva, Joe Perches, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf

On Wed, Jun 26, 2019 at 2:24 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Jun 25, 2019 at 11:47:06PM +0200, Thomas Gleixner wrote:
> > > On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:
>
> > > > but it also makes objtool unhappy:
> > > >
> > > >  arch/x86/events/intel/core.o: warning: objtool: intel_pmu_nhm_workaround()+0xb3: unreachable instruction
> > > >  kernel/fork.o: warning: objtool: free_thread_stack()+0x126: unreachable instruction
> > > >  mm/workingset.o: warning: objtool: count_shadow_nodes()+0x11f: unreachable instruction
> > > >  arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable instruction
> > > >  arch/x86/kernel/platform-quirks.o: warning: objtool: x86_early_init_platform_quirks()+0x84: unreachable instruction
> > > >  drivers/iommu/irq_remapping.o: warning: objtool: irq_remap_enable_fault_handling()+0x1d: unreachable instruction
>
> > I just checked two of them in the disassembly. In both cases it's jump
> > label related. Here is one:
> >
> >       asm volatile("1: rdmsr\n"
> >  410:   b9 59 02 00 00          mov    $0x259,%ecx
> >  415:   0f 32                   rdmsr
> >  417:   49 89 c6                mov    %rax,%r14
> >  41a:   48 89 d3                mov    %rdx,%rbx
> >       return EAX_EDX_VAL(val, low, high);
> >  41d:   48 c1 e3 20             shl    $0x20,%rbx
> >  421:   48 09 c3                or     %rax,%rbx
> >  424:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> >  429:   eb 0f                   jmp    43a <get_fixed_ranges+0xaa>
> >       do_trace_read_msr(msr, val, 0);
> >  42b:   bf 59 02 00 00          mov    $0x259,%edi   <------- "unreachable"
> >  430:   48 89 de                mov    %rbx,%rsi
> >  433:   31 d2                   xor    %edx,%edx
> >  435:   e8 00 00 00 00          callq  43a <get_fixed_ranges+0xaa>
> >  43a:   44 89 35 00 00 00 00    mov    %r14d,0x0(%rip)        # 441 <get_fixed_ranges+0xb1>
> >
> > Interestingly enough there are some more hunks of the same pattern in that
> > function which look all the same. Those are not upsetting objtool. Josh
> > might give an hint where to stare at.
>
> That's pretty atrocious code-gen :/ Does LLVM support things like label
> attributes? Back when we did jump labels GCC didn't, or rather, it
> ignored it completely when combined with asm goto (and it might still).
>
> That is, would something like this:
>
> diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
> index 06c3cc22a058..1761b1e76ddc 100644
> --- a/arch/x86/include/asm/jump_label.h
> +++ b/arch/x86/include/asm/jump_label.h
> @@ -32,7 +32,7 @@ static __always_inline bool arch_static_branch(struct static_key *key, bool bran
>                 : :  "i" (key), "i" (branch) : : l_yes);
>
>         return false;
> -l_yes:
> +l_yes: __attribute__((cold));
>         return true;
>  }
>
> @@ -49,7 +49,7 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool
>                 : :  "i" (key), "i" (branch) : : l_yes);
>
>         return false;
> -l_yes:
> +l_yes: __attribute__((hot));
>         return true;
>  }
>
> Help LLVM?

So Clang definitely complains about putting attribute hot/cold on
labels: https://godbolt.org/z/N-Z33Q
In my test case I wasn't able to influence code gen with them though
in GCC at -O2 or -O0.  Maybe GCC has a test case that shows how they
should work?
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-26  9:55                         ` Peter Zijlstra
@ 2019-06-26 22:23                           ` Nick Desaulniers
  2019-06-27  7:35                             ` Peter Zijlstra
  0 siblings, 1 reply; 52+ messages in thread
From: Nick Desaulniers @ 2019-06-26 22:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Nathan Chancellor, Kees Cook, Miguel Ojeda,
	Gustavo A. R. Silva, Joe Perches, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf

On Wed, Jun 26, 2019 at 2:55 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Jun 26, 2019 at 11:24:32AM +0200, Peter Zijlstra wrote:
> > That's pretty atrocious code-gen :/
>
> And I know nobody reads comments (I don't either), but I did write one
> on this as it happens.

I've definitely read that block in include/linux/jump_label.h; can't
say I fully understand it yet, but appreciate patience and
explanations.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-26 16:30                       ` Peter Zijlstra
@ 2019-06-26 22:33                         ` Nick Desaulniers
  2019-06-26 23:11                           ` Thomas Gleixner
  2019-06-27  7:11                           ` Peter Zijlstra
  0 siblings, 2 replies; 52+ messages in thread
From: Nick Desaulniers @ 2019-06-26 22:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Nathan Chancellor, Kees Cook, Miguel Ojeda,
	Gustavo A. R. Silva, Joe Perches, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf, Craig Topper, Chandler Carruth

On Wed, Jun 26, 2019 at 9:31 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Jun 25, 2019 at 11:47:06PM +0200, Thomas Gleixner wrote:
> > > On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:
>
> > > > but it also makes objtool unhappy:
>
> > > >  arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable instruction
>
> > I just checked two of them in the disassembly. In both cases it's jump
> > label related. Here is one:
> >
> >       asm volatile("1: rdmsr\n"
> >  410:   b9 59 02 00 00          mov    $0x259,%ecx
> >  415:   0f 32                   rdmsr
> >  417:   49 89 c6                mov    %rax,%r14
> >  41a:   48 89 d3                mov    %rdx,%rbx
> >       return EAX_EDX_VAL(val, low, high);
> >  41d:   48 c1 e3 20             shl    $0x20,%rbx
> >  421:   48 09 c3                or     %rax,%rbx
> >  424:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> >  429:   eb 0f                   jmp    43a <get_fixed_ranges+0xaa>
> >       do_trace_read_msr(msr, val, 0);
> >  42b:   bf 59 02 00 00          mov    $0x259,%edi   <------- "unreachable"

I assume if 0x42b is unreachable, that's bad as $0x259 is never stored
in %edi before the call to get_fixed_ranges+0xaa...

> >  430:   48 89 de                mov    %rbx,%rsi
> >  433:   31 d2                   xor    %edx,%edx
> >  435:   e8 00 00 00 00          callq  43a <get_fixed_ranges+0xaa>
> >  43a:   44 89 35 00 00 00 00    mov    %r14d,0x0(%rip)        # 441 <get_fixed_ranges+0xb1>
>
> Thomas provided the actual .o file, and from that we find that the
> .rela__jump_table entries look like:
>
> 000000000010  000100000002 R_X86_64_PC32     0000000000000000 .text + 3e9
> 000000000014  000100000002 R_X86_64_PC32     0000000000000000 .text + 3f0
> 000000000018  006100000018 R_X86_64_PC64     0000000000000000 __tracepoint_read_msr + 8

I assume these relocations come from arch_static_branch() (and thus
appear in triples?)

 21 static __always_inline bool arch_static_branch(struct static_key
*key, bool branch)
 22 {
 23   asm_volatile_goto("1:"
 24     ".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"
 25     ".pushsection __jump_table,  \"aw\" \n\t"
 26     _ASM_ALIGN "\n\t"
 27     ".long 1b - ., %l[l_yes] - . \n\t" // 1, 2
 28     _ASM_PTR "%c0 + %c1 - .\n\t" // 3
 29     ".popsection \n\t"
 30     : :  "i" (key), "i" (branch) : : l_yes);

> 000000000020  000100000002 R_X86_64_PC32     0000000000000000 .text + 424
> 000000000024  000100000002 R_X86_64_PC32     0000000000000000 .text + 3f0
> 000000000028  006100000018 R_X86_64_PC64     0000000000000000 __tracepoint_read_msr + 8
>
> From this we find that the jump target that goes with the NOP at +424 is
> +3f0, not +42b as one would expect.
>
> And as Josh noted, it is also 'weird' that this +3f0 is the very same as
> the target for the previous entry.

(Ok, I think I did talk to Josh about this, and I think he did mention
something about the jump targets, but I didn't really understand the
issue well at the time).

>
> When we compare the code at both sites, we find:
>
> 3f0:       bf 58 02 00 00          mov    $0x258,%edi
> 3f5:       48 89 de                mov    %rbx,%rsi
> 3f8:       31 d2                   xor    %edx,%edx
> 3fa:       e8 00 00 00 00          callq  3ff <get_fixed_ranges+0x6f>
>                 3fb: R_X86_64_PC32      do_trace_read_msr-0x4
>
> vs
>
> 42b:       bf 59 02 00 00          mov    $0x259,%edi
> 430:       48 89 de                mov    %rbx,%rsi
> 433:       31 d2                   xor    %edx,%edx
> 435:       e8 00 00 00 00          callq  43a <get_fixed_ranges+0xaa>
>                 436: R_X86_64_PC32      do_trace_read_msr-0x4
>
> Which is not in fact the same code.
>
> So for some reason the .rela__jump_table are buggy on this clang build.

So that sounds like a correctness bug then. (I'd been doing testing
with the STATIC_KEYS_SELFTEST, which I guess doesn't expose this).
I'm kind of surprised we can boot and pass STATIC_KEYS_SELFTEST.  Any
way you can help us pare down a test case?
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-26 22:33                         ` Nick Desaulniers
@ 2019-06-26 23:11                           ` Thomas Gleixner
  2019-06-27  7:11                           ` Peter Zijlstra
  1 sibling, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2019-06-26 23:11 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Peter Zijlstra, Nathan Chancellor, Kees Cook, Miguel Ojeda,
	Gustavo A. R. Silva, Joe Perches, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf, Craig Topper, Chandler Carruth

On Wed, 26 Jun 2019, Nick Desaulniers wrote:
> On Wed, Jun 26, 2019 at 9:31 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > On Tue, Jun 25, 2019 at 11:47:06PM +0200, Thomas Gleixner wrote:
> > > I just checked two of them in the disassembly. In both cases it's jump
> > > label related. Here is one:
> > >
> > >       asm volatile("1: rdmsr\n"
> > >  410:   b9 59 02 00 00          mov    $0x259,%ecx
> > >  415:   0f 32                   rdmsr
> > >  417:   49 89 c6                mov    %rax,%r14
> > >  41a:   48 89 d3                mov    %rdx,%rbx
> > >       return EAX_EDX_VAL(val, low, high);
> > >  41d:   48 c1 e3 20             shl    $0x20,%rbx
> > >  421:   48 09 c3                or     %rax,%rbx
> > >  424:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> > >  429:   eb 0f                   jmp    43a <get_fixed_ranges+0xaa>
> > >       do_trace_read_msr(msr, val, 0);
> > >  42b:   bf 59 02 00 00          mov    $0x259,%edi   <------- "unreachable"
> 
> I assume if 0x42b is unreachable, that's bad as $0x259 is never stored
> in %edi before the call to get_fixed_ranges+0xaa...

Well no. The static key will never be enabled because it's not in the jump
table entries. And that's why objtool complains. That code path @42b will
never be reached even if the tracepoints are enabled because due to the
missing entry the kernel will not patch it.

> > So for some reason the .rela__jump_table are buggy on this clang build.
> 
> So that sounds like a correctness bug then. (I'd been doing testing
> with the STATIC_KEYS_SELFTEST, which I guess doesn't expose this).
> I'm kind of surprised we can boot and pass STATIC_KEYS_SELFTEST.  Any
> way you can help us pare down a test case?

Well, the test thing works as long as the entries which are used there are
correct. And looking at the output of that kernel build I did, I get 6
unreachable entries in 6 different files. That means that ~99% are
correct. So the chance that the self test fails is low.

Vs. test case. Just compile a kernel and pick the first file where objtool
complains. Look at the disassembly which will have the

	   nopl   0x0(%rax,%rax,1)

and that do_trace_read_msr() reference right at that failing offset (or
whatever other function is called in the file you pick).

From there you should be able to debug why the compiler is not emitting the
r.rela__jump_table entry for this particular instance.

I compiled arch/x86/kernel/cpu/mtrr/generic.o several times and the failure
is fully reproducible.

Kernel version is plain v5.2-rc6 and the config I used is here:

  https://tglx.de/~tglx/config-clang-repro

Make invocation is:

  make CC=clang HOST_CC=clang arch/x86/kernel/cpu/mtrr/generic.o

that builds only that single file and not the whole kernel Moloch.

Output:

  CC      arch/x86/kernel/cpu/mtrr/generic.o
arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable instruction

That's with the compiler I built a few hours ago with Nathans fixed
build-llvm.py script. Head commit of llvm-project is:

  master 600941e34fe: Print NULL as "(null)" in diagnostic message

Hope that helps.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-26 22:33                         ` Nick Desaulniers
  2019-06-26 23:11                           ` Thomas Gleixner
@ 2019-06-27  7:11                           ` Peter Zijlstra
  1 sibling, 0 replies; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-27  7:11 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Nathan Chancellor, Kees Cook, Miguel Ojeda,
	Gustavo A. R. Silva, Joe Perches, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf, Craig Topper, Chandler Carruth

On Wed, Jun 26, 2019 at 03:33:36PM -0700, Nick Desaulniers wrote:
> On Wed, Jun 26, 2019 at 9:31 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Tue, Jun 25, 2019 at 11:47:06PM +0200, Thomas Gleixner wrote:
> > > > On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:
> >
> > > > > but it also makes objtool unhappy:
> >
> > > > >  arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable instruction
> >
> > > I just checked two of them in the disassembly. In both cases it's jump
> > > label related. Here is one:
> > >
> > >       asm volatile("1: rdmsr\n"
> > >  410:   b9 59 02 00 00          mov    $0x259,%ecx
> > >  415:   0f 32                   rdmsr
> > >  417:   49 89 c6                mov    %rax,%r14
> > >  41a:   48 89 d3                mov    %rdx,%rbx
> > >       return EAX_EDX_VAL(val, low, high);
> > >  41d:   48 c1 e3 20             shl    $0x20,%rbx
> > >  421:   48 09 c3                or     %rax,%rbx
> > >  424:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> > >  429:   eb 0f                   jmp    43a <get_fixed_ranges+0xaa>
> > >       do_trace_read_msr(msr, val, 0);
> > >  42b:   bf 59 02 00 00          mov    $0x259,%edi   <------- "unreachable"
> 
> I assume if 0x42b is unreachable, that's bad as $0x259 is never stored
> in %edi before the call to get_fixed_ranges+0xaa...

So what happens is that the __jump_table entry for 424 is wrong. When we
enable that key (read msr tracepoint) the code will jump to another
instance of the read msr tracepoint and continue running from there.

So we'll jump from one inlined instance to another, with all the
ramifications thereof; the code-flow will be completely screwy.

> > >  430:   48 89 de                mov    %rbx,%rsi
> > >  433:   31 d2                   xor    %edx,%edx
> > >  435:   e8 00 00 00 00          callq  43a <get_fixed_ranges+0xaa>
> > >  43a:   44 89 35 00 00 00 00    mov    %r14d,0x0(%rip)        # 441 <get_fixed_ranges+0xb1>
> >
> > Thomas provided the actual .o file, and from that we find that the
> > .rela__jump_table entries look like:
> >
> > 000000000010  000100000002 R_X86_64_PC32     0000000000000000 .text + 3e9
> > 000000000014  000100000002 R_X86_64_PC32     0000000000000000 .text + 3f0
> > 000000000018  006100000018 R_X86_64_PC64     0000000000000000 __tracepoint_read_msr + 8
> 
> I assume these relocations come from arch_static_branch() (and thus
> appear in triples?)
> 
>  21 static __always_inline bool arch_static_branch(struct static_key
> *key, bool branch)
>  22 {
>  23   asm_volatile_goto("1:"
>  24     ".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"

>  25     ".pushsection __jump_table,  \"aw\" \n\t"
>  26     _ASM_ALIGN "\n\t"
>  27     ".long 1b - ., %l[l_yes] - . \n\t" // 1, 2
>  28     _ASM_PTR "%c0 + %c1 - .\n\t" // 3
>  29     ".popsection \n\t"

Yes, its lines 25-29 ^ that generate the jump_table entries. The first
entry is the code location, the second is the jump target and the third
is a pointer (and two LSB state bits) to the key this belongs to.

The compiler emits .rela objects for these and this is what we use with
objtool -- just like a linker would to resolve and create the actual
__jump_table section.

>  30     : :  "i" (key), "i" (branch) : : l_yes);
> 
> > 000000000020  000100000002 R_X86_64_PC32     0000000000000000 .text + 424
> > 000000000024  000100000002 R_X86_64_PC32     0000000000000000 .text + 3f0
> > 000000000028  006100000018 R_X86_64_PC64     0000000000000000 __tracepoint_read_msr + 8
> >
> > From this we find that the jump target that goes with the NOP at +424 is
> > +3f0, not +42b as one would expect.
> >
> > And as Josh noted, it is also 'weird' that this +3f0 is the very same as
> > the target for the previous entry.
> 
> (Ok, I think I did talk to Josh about this, and I think he did mention
> something about the jump targets, but I didn't really understand the
> issue well at the time).

So what we have here is two instances of the same read msr inline. They
have code in different offsets (+3e9 and +424 resp.) but somehow the
compiler messed up and collapsed their jump target (or didn't properly
de-duplicate -- I've no idea how inline instantiation actually works).

> >
> > When we compare the code at both sites, we find:
> >
> > 3f0:       bf 58 02 00 00          mov    $0x258,%edi
> > 3f5:       48 89 de                mov    %rbx,%rsi
> > 3f8:       31 d2                   xor    %edx,%edx
> > 3fa:       e8 00 00 00 00          callq  3ff <get_fixed_ranges+0x6f>
> >                 3fb: R_X86_64_PC32      do_trace_read_msr-0x4
> >
> > vs
> >
> > 42b:       bf 59 02 00 00          mov    $0x259,%edi
> > 430:       48 89 de                mov    %rbx,%rsi
> > 433:       31 d2                   xor    %edx,%edx
> > 435:       e8 00 00 00 00          callq  43a <get_fixed_ranges+0xaa>
> >                 436: R_X86_64_PC32      do_trace_read_msr-0x4
> >
> > Which is not in fact the same code.
> >
> > So for some reason the .rela__jump_table are buggy on this clang build.
> 
> So that sounds like a correctness bug then. 

Yes, this is very very very bad. Like I wrote above, this results in
code flow moving from one inline'd instance into another, it completely
wrecks code flow integrity.

> (I'd been doing testing
> with the STATIC_KEYS_SELFTEST, which I guess doesn't expose this).
> I'm kind of surprised we can boot and pass STATIC_KEYS_SELFTEST.  Any
> way you can help us pare down a test case?

It looks like an inlining bug, and I a rare one at that. The static key
self-tests simply don't trigger this for whatever reason. Like Thomas
wrote, of all the jump_labels in the kernel, only 6 go 'funny' for
whatever reason.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-26 22:14                   ` Nick Desaulniers
@ 2019-06-27  7:12                     ` Peter Zijlstra
  2019-06-28 13:31                       ` Peter Zijlstra
  0 siblings, 1 reply; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-27  7:12 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Gustavo A. R. Silva, Joe Perches, Miguel Ojeda, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Kees Cook, Shawn Landden,
	Nathan Chancellor, Luc Van Oostenryck, Chandler Carruth

On Wed, Jun 26, 2019 at 03:14:05PM -0700, Nick Desaulniers wrote:
> On Wed, Jun 26, 2019 at 1:49 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Tue, Jun 25, 2019 at 11:15:57AM -0700, Nick Desaulniers wrote:
> >
> > > Unreleased versions of Clang built from source can;
> >
> > I've bad experiences with using unreleased compilers; life is too short.
> 
> Yes; but before release is when they need the help the most in order
> for testing to find regressions.
> 
> >
> > > We're currently planning multiple output constraint support w/ asm
> > > goto, and have recently implemented things like
> > > __GCC_ASM_FLAG_OUTPUTS__.
> >
> > That's good to hear.
> >
> > > If there's other features that we should
> > > start implementing, please let us know.
> >
> > If you've got any ideas on how to make this:
> >
> >   https://lkml.kernel.org/r/20190621120923.GT3463@hirez.programming.kicks-ass.net
> >
> > work, that'd be nice. Basically I wanted the asm goto to emit a 2 or 5
> > byte JMP/NOP depending on the displacement size. We can trivially get
> > JMP right by using:
> >
> >         jmp \l_yes
> >
> > and letting the assembler sort it, but getting the NOP right has so far
> > eluded me:
> >
> > .if \l_yes - (. + 2) < 127
> >         .byte 0x66, 0x90
> > .else
> >         .byte STATIC_KEY_INIT_NOP
> > .endif
> >
> > doesn't work. We can ofcourse unconditionally emit the JMP and then
> > rewrite the binary afterward, and replace the emitted jumps with the
> > right size NOP, but that's a bit yuck.
> >
> > Once it emits the variable size instruction consistently, we can update
> > the patching side to use the same condition to select the new
> > instruction (and fix objtool).
> 
> Not sure; the assembler directives and their requirements aren't
> something I'm too familiar with.

Josh came up with the following:

+		/* If the jump target is close, do a 2-byte nop: */
+		".skip -(%l[l_yes] - 1b <= 126), 0x66\n"
+		".skip -(%l[l_yes] - 1b <= 126), 0x90\n"
+		/* Otherwise do a 5-byte nop: */
+		".skip -(%l[l_yes] - 1b > 126), 0x0f\n"
+		".skip -(%l[l_yes] - 1b > 126), 0x1f\n"
+		".skip -(%l[l_yes] - 1b > 126), 0x44\n"
+		".skip -(%l[l_yes] - 1b > 126), 0x00\n"
+		".skip -(%l[l_yes] - 1b > 126), 0x00\n"

Which is a wonderfully gruesome hack :-) So I'll be playing with that
for a bit.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-26 22:15                         ` Nick Desaulniers
@ 2019-06-27  7:16                           ` Peter Zijlstra
  0 siblings, 0 replies; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-27  7:16 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Nathan Chancellor, Kees Cook, Miguel Ojeda,
	Gustavo A. R. Silva, Joe Perches, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf

On Wed, Jun 26, 2019 at 03:15:38PM -0700, Nick Desaulniers wrote:
> On Wed, Jun 26, 2019 at 2:24 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Tue, Jun 25, 2019 at 11:47:06PM +0200, Thomas Gleixner wrote:
> > > > On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:
> >
> > > > > but it also makes objtool unhappy:
> > > > >
> > > > >  arch/x86/events/intel/core.o: warning: objtool: intel_pmu_nhm_workaround()+0xb3: unreachable instruction
> > > > >  kernel/fork.o: warning: objtool: free_thread_stack()+0x126: unreachable instruction
> > > > >  mm/workingset.o: warning: objtool: count_shadow_nodes()+0x11f: unreachable instruction
> > > > >  arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable instruction
> > > > >  arch/x86/kernel/platform-quirks.o: warning: objtool: x86_early_init_platform_quirks()+0x84: unreachable instruction
> > > > >  drivers/iommu/irq_remapping.o: warning: objtool: irq_remap_enable_fault_handling()+0x1d: unreachable instruction
> >
> > > I just checked two of them in the disassembly. In both cases it's jump
> > > label related. Here is one:
> > >
> > >       asm volatile("1: rdmsr\n"
> > >  410:   b9 59 02 00 00          mov    $0x259,%ecx
> > >  415:   0f 32                   rdmsr
> > >  417:   49 89 c6                mov    %rax,%r14
> > >  41a:   48 89 d3                mov    %rdx,%rbx
> > >       return EAX_EDX_VAL(val, low, high);
> > >  41d:   48 c1 e3 20             shl    $0x20,%rbx
> > >  421:   48 09 c3                or     %rax,%rbx
> > >  424:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> > >  429:   eb 0f                   jmp    43a <get_fixed_ranges+0xaa>
> > >       do_trace_read_msr(msr, val, 0);
> > >  42b:   bf 59 02 00 00          mov    $0x259,%edi   <------- "unreachable"
> > >  430:   48 89 de                mov    %rbx,%rsi
> > >  433:   31 d2                   xor    %edx,%edx
> > >  435:   e8 00 00 00 00          callq  43a <get_fixed_ranges+0xaa>
> > >  43a:   44 89 35 00 00 00 00    mov    %r14d,0x0(%rip)        # 441 <get_fixed_ranges+0xb1>
> > >
> > > Interestingly enough there are some more hunks of the same pattern in that
> > > function which look all the same. Those are not upsetting objtool. Josh
> > > might give an hint where to stare at.
> >
> > That's pretty atrocious code-gen :/ Does LLVM support things like label
> > attributes? Back when we did jump labels GCC didn't, or rather, it
> > ignored it completely when combined with asm goto (and it might still).
> >
> > That is, would something like this:
> >
> > diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
> > index 06c3cc22a058..1761b1e76ddc 100644
> > --- a/arch/x86/include/asm/jump_label.h
> > +++ b/arch/x86/include/asm/jump_label.h
> > @@ -32,7 +32,7 @@ static __always_inline bool arch_static_branch(struct static_key *key, bool bran
> >                 : :  "i" (key), "i" (branch) : : l_yes);
> >
> >         return false;
> > -l_yes:
> > +l_yes: __attribute__((cold));
> >         return true;
> >  }
> >
> > @@ -49,7 +49,7 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool
> >                 : :  "i" (key), "i" (branch) : : l_yes);
> >
> >         return false;
> > -l_yes:
> > +l_yes: __attribute__((hot));
> >         return true;
> >  }
> >
> > Help LLVM?

As I wrote later; the above suggestion is actually wrong :/

> So Clang definitely complains about putting attribute hot/cold on
> labels: https://godbolt.org/z/N-Z33Q
> In my test case I wasn't able to influence code gen with them though
> in GCC at -O2 or -O0.  Maybe GCC has a test case that shows how they
> should work?

As I wrote in that same later email; the way we influence the actual
code-layout is with the __builtin_expect() thing. Let me expand on that
in another email.

Sadly, I've no clue what so ever about compiler internals, be it GCC or
LLVM.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-26 22:23                           ` Nick Desaulniers
@ 2019-06-27  7:35                             ` Peter Zijlstra
  0 siblings, 0 replies; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-27  7:35 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Nathan Chancellor, Kees Cook, Miguel Ojeda,
	Gustavo A. R. Silva, Joe Perches, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Shawn Landden, clang-built-linux,
	Josh Poimboeuf

On Wed, Jun 26, 2019 at 03:23:24PM -0700, Nick Desaulniers wrote:
> On Wed, Jun 26, 2019 at 2:55 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Wed, Jun 26, 2019 at 11:24:32AM +0200, Peter Zijlstra wrote:
> > > That's pretty atrocious code-gen :/
> >
> > And I know nobody reads comments (I don't either), but I did write one
> > on this as it happens.
> 
> I've definitely read that block in include/linux/jump_label.h; can't
> say I fully understand it yet, but appreciate patience and
> explanations.

So the relevant bits are:

 * type\branch| likely (1)            | unlikely (0)
 * -----------+-----------------------+------------------
 *            |                       |
 *  true (1)  |    ...                |    ...
 *            |    NOP                |    JMP L
 *            |    <br-stmts>         | 1: ...
 *            | L: ...                |
 *            |                       |
 *            |                       | L: <br-stmts>
 *            |                       |    jmp 1b
 *            |                       |
 * -----------+-----------------------+------------------
 *            |                       |
 *  false (0) |    ...                |    ...
 *            |    JMP L              |    NOP
 *            |    <br-stmts>         | 1: ...
 *            | L: ...                |
 *            |                       |
 *            |                       | L: <br-stmts>
 *            |                       |    jmp 1b
 *            |                       |
 * -----------+-----------------------+------------------

So we have two types, static_key_true, which defaults to true and
static_key_false, which defaults (unsurprisingly) to false. At runtime
they can be switched at will, it is just the initial state which
determines what code we actually need to emit at compile time.

And we have two statements: static_branch_likely(), the branch is likely
-- or we want the block in-line, and static_branch_unlikely(), the
branch is unlikely -- or we want the block out-of-line.

This is coded like:

#define static_branch_likely(x)							\
({										\
	bool branch;								\
	if (__builtin_types_compatible_p(typeof(*x), struct static_key_true))	\
		branch = !arch_static_branch(&(x)->key, true);			\
	else if (__builtin_types_compatible_p(typeof(*x), struct static_key_false)) \
		branch = !arch_static_branch_jump(&(x)->key, true);		\
	else									\
		branch = ____wrong_branch_error();				\
	likely(branch);								\
})

#define static_branch_unlikely(x)						\
({										\
	bool branch;								\
	if (__builtin_types_compatible_p(typeof(*x), struct static_key_true))	\
		branch = arch_static_branch_jump(&(x)->key, false);		\
	else if (__builtin_types_compatible_p(typeof(*x), struct static_key_false)) \
		branch = arch_static_branch(&(x)->key, false);			\
	else									\
		branch = ____wrong_branch_error();				\
	unlikely(branch);							\
})

Let's walk through static_branch_unlikely() (the other is very similar,
just reversed).

We use __builtin_types_compatible_p() to compile-time detect which key
type is used, such that we can emit the right initial code:

  - static_key_true; we must emit a JMP to the block,
  - static_key_false; we must emit a NOP and not execute the block.
  - neither; we generate a link error.

Then we take the return value and use __builtin_expect(, 0) on it to
influence the block layout, specifically we want the block to be
out-of-line.

It appears the __builtin_expect() usage isn't working right with LLVM
resuling in that layout issue Thomas spotted. GCC8+ can even place them
in the .text.unlikely section as func.cold.N parts/symbols. But the main
point is to get the block away from the normal I$ content to minimize
impact.



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-27  7:12                     ` Peter Zijlstra
@ 2019-06-28 13:31                       ` Peter Zijlstra
  2019-06-28 18:44                         ` Nick Desaulniers
  0 siblings, 1 reply; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-28 13:31 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Gustavo A. R. Silva, Joe Perches, Miguel Ojeda, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Kees Cook, Shawn Landden,
	Nathan Chancellor, Luc Van Oostenryck, Chandler Carruth

On Thu, Jun 27, 2019 at 09:12:50AM +0200, Peter Zijlstra wrote:

> Josh came up with the following:
> 
> +		/* If the jump target is close, do a 2-byte nop: */
> +		".skip -(%l[l_yes] - 1b <= 126), 0x66\n"
> +		".skip -(%l[l_yes] - 1b <= 126), 0x90\n"
> +		/* Otherwise do a 5-byte nop: */
> +		".skip -(%l[l_yes] - 1b > 126), 0x0f\n"
> +		".skip -(%l[l_yes] - 1b > 126), 0x1f\n"
> +		".skip -(%l[l_yes] - 1b > 126), 0x44\n"
> +		".skip -(%l[l_yes] - 1b > 126), 0x00\n"
> +		".skip -(%l[l_yes] - 1b > 126), 0x00\n"
> 
> Which is a wonderfully gruesome hack :-) So I'll be playing with that
> for a bit.

For those with interest; full patches at:

  https://lkml.kernel.org/r/20190628102113.360432762@infradead.org

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-28 13:31                       ` Peter Zijlstra
@ 2019-06-28 18:44                         ` Nick Desaulniers
  2019-06-29  7:10                           ` Peter Zijlstra
  0 siblings, 1 reply; 52+ messages in thread
From: Nick Desaulniers @ 2019-06-28 18:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Gustavo A. R. Silva, Joe Perches, Miguel Ojeda, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Kees Cook, Shawn Landden,
	Nathan Chancellor, Luc Van Oostenryck, Chandler Carruth,
	Jann Horn, Bill Wendling, Alexander Potapenko

On Fri, Jun 28, 2019 at 6:31 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Jun 27, 2019 at 09:12:50AM +0200, Peter Zijlstra wrote:
>
> > Josh came up with the following:
> >
> > +             /* If the jump target is close, do a 2-byte nop: */
> > +             ".skip -(%l[l_yes] - 1b <= 126), 0x66\n"
> > +             ".skip -(%l[l_yes] - 1b <= 126), 0x90\n"
> > +             /* Otherwise do a 5-byte nop: */
> > +             ".skip -(%l[l_yes] - 1b > 126), 0x0f\n"
> > +             ".skip -(%l[l_yes] - 1b > 126), 0x1f\n"
> > +             ".skip -(%l[l_yes] - 1b > 126), 0x44\n"
> > +             ".skip -(%l[l_yes] - 1b > 126), 0x00\n"
> > +             ".skip -(%l[l_yes] - 1b > 126), 0x00\n"
> >
> > Which is a wonderfully gruesome hack :-) So I'll be playing with that
> > for a bit.
>
> For those with interest; full patches at:
>
>   https://lkml.kernel.org/r/20190628102113.360432762@infradead.org

Do you have a branch pushed that I can pull this from to quickly test w/ Clang?

The .skip trick is wild; I don't quite understand the negation in the
above or patch 8/8 for is_byte/is_long.

Also, the comment on 8/8 about patching early hits home; we had a
sign-extending-booleans bug that was causing the address calculation
to be off by two.  Jann and Bill had to help me debug that one, and
funnily enough Kees fixed it in LLVM.  Fetching exception frames out
of early_idt_handler_common has been my most memorable kernel
debugging experience to date, and hope I don't have to do that ever
again.  Kees this week adjusted where arm64 does static_key enablement
(moved it earlier for Alexander Potapenko's slab
initialization/poisoning set).

For the wrong __jump_table entry; I consider that a critical issue we
need to fix before the clang-9 release.  I'm unloading my current
responsibilities at work to be able to sit and focus on bug.  I'll
probably start a new thread with you, tglx, Josh, and our mailing list
next week (sorry for co-opting this thread).  I have been using
creduce quite successfully for finding and fixing our previous codegen
bugs (https://nickdesaulniers.github.io/blog/2019/01/18/finding-compiler-bugs-with-c-reduce/),
but I need to sit and understand the precise failure more in order to
reduce the input.  We can see pretty well where in the compilation
pipeline things go wrong; I just find it hard to page through large
inputs such as whole translation units.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-28 18:44                         ` Nick Desaulniers
@ 2019-06-29  7:10                           ` Peter Zijlstra
  0 siblings, 0 replies; 52+ messages in thread
From: Peter Zijlstra @ 2019-06-29  7:10 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Gustavo A. R. Silva, Joe Perches, Miguel Ojeda, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, H. Peter Anvin,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kan Liang, linux-kernel, Kees Cook, Shawn Landden,
	Nathan Chancellor, Luc Van Oostenryck, Chandler Carruth,
	Jann Horn, Bill Wendling, Alexander Potapenko

On Fri, Jun 28, 2019 at 11:44:16AM -0700, Nick Desaulniers wrote:
> On Fri, Jun 28, 2019 at 6:31 AM Peter Zijlstra <peterz@infradead.org> wrote:

> > For those with interest; full patches at:
> >
> >   https://lkml.kernel.org/r/20190628102113.360432762@infradead.org
> 
> Do you have a branch pushed that I can pull this from to quickly test w/ Clang?

I've not yet pushied it out, will do on Monday or so.

> The .skip trick is wild; I don't quite understand the negation in the
> above or patch 8/8 for is_byte/is_long.

Yes, that's a bit magic. What happens is that GAS has:

 false := 0
 true := ~false

(it lacks a boolean not and uses a bitwise negate instead). Therefore,
the result of an (true) compare is all-1-s (or -1), and since we want a
single .skip we have to negate.

The actual condition for the result is more complicated than it should
be, but that only came to me after sendind out these patches, basically
it should be:

  (val >> 31) == (val >> 7)

to test if a s32 van be represented in a s8.

> For the wrong __jump_table entry; I consider that a critical issue we
> need to fix before the clang-9 release.  I'm unloading my current
> responsibilities at work to be able to sit and focus on bug.  I'll
> probably start a new thread with you, tglx, Josh, and our mailing list
> next week (sorry for co-opting this thread).  I have been using
> creduce quite successfully for finding and fixing our previous codegen
> bugs (https://nickdesaulniers.github.io/blog/2019/01/18/finding-compiler-bugs-with-c-reduce/),
> but I need to sit and understand the precise failure more in order to
> reduce the input.  We can see pretty well where in the compilation
> pipeline things go wrong; I just find it hard to page through large
> inputs such as whole translation units.

Sure; add me to the thread and I'll be glad to answer anything I can.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [tip:perf/urgent] perf/x86/intel: Mark expected switch fall-throughs
  2019-06-24 16:19 [PATCH] perf/x86/intel: Mark expected switch fall-throughs Gustavo A. R. Silva
  2019-06-24 19:31 ` Peter Zijlstra
@ 2019-07-25 16:27 ` tip-bot for Gustavo A. R. Silva
  2019-07-25 17:06   ` Borislav Petkov
  1 sibling, 1 reply; 52+ messages in thread
From: tip-bot for Gustavo A. R. Silva @ 2019-07-25 16:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, kan.liang, namhyung, jolsa, peterz,
	alexander.shishkin, keescook, hpa, bp, tglx, torvalds, acme,
	mingo, gustavo

Commit-ID:  289a2d22b5b611d85030795802a710e9f520df29
Gitweb:     https://git.kernel.org/tip/289a2d22b5b611d85030795802a710e9f520df29
Author:     Gustavo A. R. Silva <gustavo@embeddedor.com>
AuthorDate: Mon, 24 Jun 2019 11:19:13 -0500
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 25 Jul 2019 15:57:03 +0200

perf/x86/intel: Mark expected switch fall-throughs

In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.

This patch fixes the following warnings:

  arch/x86/events/intel/core.c: In function ‘intel_pmu_init’:
  arch/x86/events/intel/core.c:4959:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
  arch/x86/events/intel/core.c:5008:8: warning: this statement may fall through [-Wimplicit-fallthrough=]

Warning level 3 was used: -Wimplicit-fallthrough=3

This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190624161913.GA32270@embeddedor
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index c9075fc75cb6..648260b5f367 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4954,6 +4954,7 @@ __init int intel_pmu_init(void)
 
 	case INTEL_FAM6_SKYLAKE_X:
 		pmem = true;
+		/* fall through */
 	case INTEL_FAM6_SKYLAKE_MOBILE:
 	case INTEL_FAM6_SKYLAKE_DESKTOP:
 	case INTEL_FAM6_KABYLAKE_MOBILE:
@@ -5003,6 +5004,7 @@ __init int intel_pmu_init(void)
 	case INTEL_FAM6_ICELAKE_X:
 	case INTEL_FAM6_ICELAKE_XEON_D:
 		pmem = true;
+		/* fall through */
 	case INTEL_FAM6_ICELAKE_MOBILE:
 	case INTEL_FAM6_ICELAKE_DESKTOP:
 		x86_pmu.late_ack = true;

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [tip:perf/urgent] perf/x86/intel: Mark expected switch fall-throughs
  2019-07-25 16:27 ` [tip:perf/urgent] " tip-bot for Gustavo A. R. Silva
@ 2019-07-25 17:06   ` Borislav Petkov
  2019-07-25 17:35     ` Peter Zijlstra
  0 siblings, 1 reply; 52+ messages in thread
From: Borislav Petkov @ 2019-07-25 17:06 UTC (permalink / raw)
  To: hpa, tglx, mingo, gustavo, torvalds, acme, kan.liang, namhyung,
	jolsa, linux-kernel, alexander.shishkin, keescook, peterz
  Cc: linux-tip-commits

On Thu, Jul 25, 2019 at 09:27:10AM -0700, tip-bot for Gustavo A. R. Silva wrote:
> Commit-ID:  289a2d22b5b611d85030795802a710e9f520df29
> Gitweb:     https://git.kernel.org/tip/289a2d22b5b611d85030795802a710e9f520df29
> Author:     Gustavo A. R. Silva <gustavo@embeddedor.com>
> AuthorDate: Mon, 24 Jun 2019 11:19:13 -0500
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Thu, 25 Jul 2019 15:57:03 +0200
> 
> perf/x86/intel: Mark expected switch fall-throughs
> 
> In preparation to enabling -Wimplicit-fallthrough, mark switch
> cases where we are expecting to fall through.
> 
> This patch fixes the following warnings:

"This patch"

> 
>   arch/x86/events/intel/core.c: In function ‘intel_pmu_init’:
>   arch/x86/events/intel/core.c:4959:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
>   arch/x86/events/intel/core.c:5008:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
> 
> Warning level 3 was used: -Wimplicit-fallthrough=3
> 
> This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough.

Another "This patch"

I think it is clear about which patch the commit message is talking
about, without stating it explicitly.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [tip:perf/urgent] perf/x86/intel: Mark expected switch fall-throughs
  2019-07-25 17:06   ` Borislav Petkov
@ 2019-07-25 17:35     ` Peter Zijlstra
  2019-07-25 23:18       ` Joe Perches
  0 siblings, 1 reply; 52+ messages in thread
From: Peter Zijlstra @ 2019-07-25 17:35 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: hpa, tglx, mingo, gustavo, torvalds, acme, kan.liang, namhyung,
	jolsa, linux-kernel, alexander.shishkin, keescook,
	linux-tip-commits

On Thu, Jul 25, 2019 at 07:06:13PM +0200, Borislav Petkov wrote:
> On Thu, Jul 25, 2019 at 09:27:10AM -0700, tip-bot for Gustavo A. R. Silva wrote:
> > Commit-ID:  289a2d22b5b611d85030795802a710e9f520df29
> > Gitweb:     https://git.kernel.org/tip/289a2d22b5b611d85030795802a710e9f520df29
> > Author:     Gustavo A. R. Silva <gustavo@embeddedor.com>
> > AuthorDate: Mon, 24 Jun 2019 11:19:13 -0500
> > Committer:  Ingo Molnar <mingo@kernel.org>
> > CommitDate: Thu, 25 Jul 2019 15:57:03 +0200
> > 
> > perf/x86/intel: Mark expected switch fall-throughs
> > 
> > In preparation to enabling -Wimplicit-fallthrough, mark switch
> > cases where we are expecting to fall through.
> > 
> > This patch fixes the following warnings:
> 
> "This patch"
> 
> > 
> >   arch/x86/events/intel/core.c: In function ‘intel_pmu_init’:
> >   arch/x86/events/intel/core.c:4959:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
> >   arch/x86/events/intel/core.c:5008:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
> > 
> > Warning level 3 was used: -Wimplicit-fallthrough=3
> > 
> > This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough.
> 
> Another "This patch"
> 
> I think it is clear about which patch the commit message is talking
> about, without stating it explicitly.

It fits with the whole atrocious 'comments are significant' premise that
the Changelog is atrocious too :-)

/me runs for cover.

Seriously though; I detest these patches and we really, as in _really_
should have done that attribute thing.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [tip:perf/urgent] perf/x86/intel: Mark expected switch fall-throughs
  2019-07-25 17:35     ` Peter Zijlstra
@ 2019-07-25 23:18       ` Joe Perches
  2019-07-25 23:28         ` Kees Cook
  0 siblings, 1 reply; 52+ messages in thread
From: Joe Perches @ 2019-07-25 23:18 UTC (permalink / raw)
  To: Peter Zijlstra, Borislav Petkov
  Cc: hpa, tglx, mingo, gustavo, torvalds, acme, kan.liang, namhyung,
	jolsa, linux-kernel, alexander.shishkin, keescook,
	linux-tip-commits

On Thu, 2019-07-25 at 19:35 +0200, Peter Zijlstra wrote:
> Seriously though; I detest these patches and we really, as in _really_
> should have done that attribute thing.

At least it'll be fairly easy to convert to something
sensible later.

Variants of the equivalent of:

s@/* fallthrough */@fallthrough;@

with some trivial whitespace reformatting will read
_much_ better.

It's pretty scriptable.



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [tip:perf/urgent] perf/x86/intel: Mark expected switch fall-throughs
  2019-07-25 23:18       ` Joe Perches
@ 2019-07-25 23:28         ` Kees Cook
  0 siblings, 0 replies; 52+ messages in thread
From: Kees Cook @ 2019-07-25 23:28 UTC (permalink / raw)
  To: Joe Perches
  Cc: Peter Zijlstra, Borislav Petkov, hpa, tglx, mingo, gustavo,
	torvalds, acme, kan.liang, namhyung, jolsa, linux-kernel,
	alexander.shishkin, linux-tip-commits

On Thu, Jul 25, 2019 at 04:18:56PM -0700, Joe Perches wrote:
> On Thu, 2019-07-25 at 19:35 +0200, Peter Zijlstra wrote:
> > Seriously though; I detest these patches and we really, as in _really_
> > should have done that attribute thing.
> 
> At least it'll be fairly easy to convert to something
> sensible later.
> 
> Variants of the equivalent of:
> 
> s@/* fallthrough */@fallthrough;@
> 
> with some trivial whitespace reformatting will read
> _much_ better.
> 
> It's pretty scriptable.

Yup; that's been my perspective. First let's finish markings (so so
close right now), then we can do the next phase.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2019-07-25 23:28 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-24 16:19 [PATCH] perf/x86/intel: Mark expected switch fall-throughs Gustavo A. R. Silva
2019-06-24 19:31 ` Peter Zijlstra
2019-06-24 19:45   ` Joe Perches
2019-06-24 20:37     ` Peter Zijlstra
2019-06-24 20:53       ` Gustavo A. R. Silva
2019-06-24 20:57         ` Joe Perches
2019-06-25  7:20           ` Peter Zijlstra
2019-06-24 22:28         ` Miguel Ojeda
2019-06-25  7:18           ` Peter Zijlstra
2019-06-25 12:47             ` Miguel Ojeda
2019-06-25 18:15               ` Nick Desaulniers
2019-06-25 22:29                 ` Joe Perches
2019-06-25 22:57                   ` Nick Desaulniers
2019-06-25 23:25                     ` Joe Perches
2019-06-26  8:49                 ` Peter Zijlstra
2019-06-26 22:14                   ` Nick Desaulniers
2019-06-27  7:12                     ` Peter Zijlstra
2019-06-28 13:31                       ` Peter Zijlstra
2019-06-28 18:44                         ` Nick Desaulniers
2019-06-29  7:10                           ` Peter Zijlstra
2019-06-25 17:12             ` Kees Cook
2019-06-25 18:05               ` Nathan Chancellor
2019-06-25 19:53                 ` Thomas Gleixner
2019-06-25 20:27                   ` Nathan Chancellor
2019-06-25 20:37                     ` Nick Desaulniers
2019-06-25 21:47                     ` Thomas Gleixner
2019-06-26  5:10                       ` Nathan Chancellor
2019-06-26 15:18                         ` Thomas Gleixner
2019-06-26 19:00                           ` Nathan Chancellor
2019-06-26 19:46                             ` Thomas Gleixner
2019-06-26 20:03                               ` Nathan Chancellor
2019-06-26  9:24                       ` Peter Zijlstra
2019-06-26  9:55                         ` Peter Zijlstra
2019-06-26 22:23                           ` Nick Desaulniers
2019-06-27  7:35                             ` Peter Zijlstra
2019-06-26 10:43                         ` Peter Zijlstra
2019-06-26 22:15                         ` Nick Desaulniers
2019-06-27  7:16                           ` Peter Zijlstra
2019-06-26 16:30                       ` Peter Zijlstra
2019-06-26 22:33                         ` Nick Desaulniers
2019-06-26 23:11                           ` Thomas Gleixner
2019-06-27  7:11                           ` Peter Zijlstra
2019-06-25 23:46                     ` Arnaldo Carvalho de Melo
2019-06-26  5:14                       ` Nathan Chancellor
2019-06-25 20:09                 ` Kees Cook
2019-06-26  8:06               ` Peter Zijlstra
2019-06-25  7:15         ` Peter Zijlstra
2019-07-25 16:27 ` [tip:perf/urgent] " tip-bot for Gustavo A. R. Silva
2019-07-25 17:06   ` Borislav Petkov
2019-07-25 17:35     ` Peter Zijlstra
2019-07-25 23:18       ` Joe Perches
2019-07-25 23:28         ` Kees Cook

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.