* [PATCH] x86: mark some mpspec inline functions as __init @ 2021-02-25 11:22 Arnd Bergmann 2021-02-25 11:45 ` Borislav Petkov 0 siblings, 1 reply; 13+ messages in thread From: Arnd Bergmann @ 2021-02-25 11:22 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86, Nathan Chancellor, Nick Desaulniers Cc: Arnd Bergmann, H. Peter Anvin, linux-kernel, clang-built-linux From: Arnd Bergmann <arnd@arndb.de> clang-13 sometimes decides to not inline early_get_smp_config(), which leads to a link-time warning: WARNING: modpost: vmlinux.o(.text+0x838cc): Section mismatch in reference from the function early_get_smp_config() to the variable .init.data:x86_init The function early_get_smp_config() references the variable __initdata x86_init. This is often because early_get_smp_config lacks a __initdata annotation or the annotation of x86_init is wrong. There are two other functions which may run into the same issue, so mark all three as __init. Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- arch/x86/include/asm/mpspec.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/mpspec.h b/arch/x86/include/asm/mpspec.h index e90ac7e9ae2c..b41066dbf5c2 100644 --- a/arch/x86/include/asm/mpspec.h +++ b/arch/x86/include/asm/mpspec.h @@ -49,17 +49,17 @@ extern int smp_found_config; # define smp_found_config 0 #endif -static inline void get_smp_config(void) +static inline __init void get_smp_config(void) { x86_init.mpparse.get_smp_config(0); } -static inline void early_get_smp_config(void) +static inline __init void early_get_smp_config(void) { x86_init.mpparse.get_smp_config(1); } -static inline void find_smp_config(void) +static inline __init void find_smp_config(void) { x86_init.mpparse.find_smp_config(); } -- 2.29.2 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: mark some mpspec inline functions as __init 2021-02-25 11:22 [PATCH] x86: mark some mpspec inline functions as __init Arnd Bergmann @ 2021-02-25 11:45 ` Borislav Petkov 2021-02-25 12:18 ` Arnd Bergmann 0 siblings, 1 reply; 13+ messages in thread From: Borislav Petkov @ 2021-02-25 11:45 UTC (permalink / raw) To: Arnd Bergmann Cc: Thomas Gleixner, Ingo Molnar, x86, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, H. Peter Anvin, linux-kernel, clang-built-linux On Thu, Feb 25, 2021 at 12:22:41PM +0100, Arnd Bergmann wrote: > From: Arnd Bergmann <arnd@arndb.de> > > clang-13 sometimes decides to not inline early_get_smp_config(), > which leads to a link-time warning: > > WARNING: modpost: vmlinux.o(.text+0x838cc): Section mismatch in reference from the function early_get_smp_config() to the variable .init.data:x86_init > The function early_get_smp_config() references > the variable __initdata x86_init. > This is often because early_get_smp_config lacks a __initdata > annotation or the annotation of x86_init is wrong. > > There are two other functions which may run into the same issue, > so mark all three as __init. > > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > --- > arch/x86/include/asm/mpspec.h | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/include/asm/mpspec.h b/arch/x86/include/asm/mpspec.h > index e90ac7e9ae2c..b41066dbf5c2 100644 > --- a/arch/x86/include/asm/mpspec.h > +++ b/arch/x86/include/asm/mpspec.h > @@ -49,17 +49,17 @@ extern int smp_found_config; > # define smp_found_config 0 > #endif > > -static inline void get_smp_config(void) > +static inline __init void get_smp_config(void) __always_inline then I guess. Not inlining those is just silly. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: mark some mpspec inline functions as __init 2021-02-25 11:45 ` Borislav Petkov @ 2021-02-25 12:18 ` Arnd Bergmann 2021-02-25 12:42 ` Borislav Petkov 0 siblings, 1 reply; 13+ messages in thread From: Arnd Bergmann @ 2021-02-25 12:18 UTC (permalink / raw) To: Borislav Petkov Cc: Thomas Gleixner, Ingo Molnar, the arch/x86 maintainers, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, H. Peter Anvin, linux-kernel, clang-built-linux On Thu, Feb 25, 2021 at 12:45 PM Borislav Petkov <bp@alien8.de> wrote: > On Thu, Feb 25, 2021 at 12:22:41PM +0100, Arnd Bergmann wrote: > > -static inline void get_smp_config(void) > > +static inline __init void get_smp_config(void) > > __always_inline then I guess. > > Not inlining those is just silly. Either way works correctly, I don't care much, but picked the __init annotation as it seemed more intuitive. If the compiler decides to make it out-of-line for whatever reason, I see no point in telling it otherwise, even though I agree it is a bit silly. Should I send the patch with __always_inline? Arnd ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: mark some mpspec inline functions as __init 2021-02-25 12:18 ` Arnd Bergmann @ 2021-02-25 12:42 ` Borislav Petkov 2021-02-25 13:20 ` Arnd Bergmann 0 siblings, 1 reply; 13+ messages in thread From: Borislav Petkov @ 2021-02-25 12:42 UTC (permalink / raw) To: Arnd Bergmann Cc: Thomas Gleixner, Ingo Molnar, the arch/x86 maintainers, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, H. Peter Anvin, linux-kernel, clang-built-linux On Thu, Feb 25, 2021 at 01:18:21PM +0100, Arnd Bergmann wrote: > Either way works correctly, I don't care much, but picked the __init > annotation as it seemed more intuitive. If the compiler decides to > make it out-of-line for whatever reason, Well, frankly, I see no good reason for not inlining a function body which is a single call. And gcc does it just fine. And previous clangs did too, so why does clang-13 do it differently? IOW, could it be that you're fixing something that ain't broke? > I see no point in telling it otherwise, even though I agree it is a > bit silly. > > Should I send the patch with __always_inline? I guess. Although from where I'm standing, it looks like clang-13 needs fixing. But I surely don't know the whole story and "inline" is not forcing the inlining so I guess a compiler is free to do what it wants here. Apparently. And I guess telling it that those should be always inlined makes it perfectly clear then. But WTH do I know... Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: mark some mpspec inline functions as __init 2021-02-25 12:42 ` Borislav Petkov @ 2021-02-25 13:20 ` Arnd Bergmann 2021-02-25 20:31 ` Nick Desaulniers 0 siblings, 1 reply; 13+ messages in thread From: Arnd Bergmann @ 2021-02-25 13:20 UTC (permalink / raw) To: Borislav Petkov Cc: Thomas Gleixner, Ingo Molnar, the arch/x86 maintainers, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, H. Peter Anvin, linux-kernel, clang-built-linux On Thu, Feb 25, 2021 at 1:42 PM Borislav Petkov <bp@alien8.de> wrote: > > On Thu, Feb 25, 2021 at 01:18:21PM +0100, Arnd Bergmann wrote: > > Either way works correctly, I don't care much, but picked the __init > > annotation as it seemed more intuitive. If the compiler decides to > > make it out-of-line for whatever reason, > > Well, frankly, I see no good reason for not inlining a function body > which is a single call. And gcc does it just fine. And previous clangs > did too, so why does clang-13 do it differently? > > IOW, could it be that you're fixing something that ain't broke? Maybe Nick can give some more background here. He mentioned elsewhere that inlining in clang-13 was completely rewritten and is generally better than it was before. I'm not sure whether this particular instance is a case where clang could in fact show an advantage in not inlining a function, or if this is one case where it got worse and needs to be tuned better. In the end, inlining is a bunch of heuristics that are intended help most of the time, but both (old) clang and gcc get it wrong in cases that should have been decided the other way. Here we get a new method that may go wrong in other ways. Arnd ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: mark some mpspec inline functions as __init 2021-02-25 13:20 ` Arnd Bergmann @ 2021-02-25 20:31 ` Nick Desaulniers 2021-02-25 21:33 ` Borislav Petkov 2021-02-25 22:54 ` Peter Zijlstra 0 siblings, 2 replies; 13+ messages in thread From: Nick Desaulniers @ 2021-02-25 20:31 UTC (permalink / raw) To: Borislav Petkov Cc: Thomas Gleixner, Ingo Molnar, the arch/x86 maintainers, Nathan Chancellor, Arnd Bergmann, H. Peter Anvin, linux-kernel, clang-built-linux, Arnd Bergmann, Arthur Eubanks, Chandler Carruth On Thu, Feb 25, 2021 at 5:20 AM Arnd Bergmann <arnd@kernel.org> wrote: > > On Thu, Feb 25, 2021 at 1:42 PM Borislav Petkov <bp@alien8.de> wrote: > > > > On Thu, Feb 25, 2021 at 01:18:21PM +0100, Arnd Bergmann wrote: > > > Either way works correctly, I don't care much, but picked the __init > > > annotation as it seemed more intuitive. If the compiler decides to > > > make it out-of-line for whatever reason, > > > > Well, frankly, I see no good reason for not inlining a function body > > which is a single call. And gcc does it just fine. And previous clangs > > did too, so why does clang-13 do it differently? > > > > IOW, could it be that you're fixing something that ain't broke? It's a reasonable question to ask: "why didn't this get inlined?" which is worthwhile to revisit on occasion. I'll post a more generic answer than for this specific case since there's a bit of fallout from the recent changes, and I plan to explain this in one place then share lore links on all of the other threads. Q: What changed? A: Large changes to LLVM's pass management have finally been enabled by default in ToT LLVM (what will ship as clang-13). Specifically, LLVM's "new pass manager" has been made the default after probably at least 7 years worth of effort. (The previous system is known as the "legacy pass manager.") https://bugs.llvm.org/show_bug.cgi?id=46649. With this change, many heuristics related to inlining have changed. Cost models have changed, thresholds have been adjusted. I suspect there's changes in what granularity of IR gets optimized, but I need to do more research here. Next on my reading list, for anyone who wants to learn more is: "Passes in LLVM": https://llvm.org/devmtg/2014-04/PDFs/Talks/Passes.pdf https://www.youtube.com/watch?v=rY02LT08-J8 "The LLVM Pass Manager, Part 2" https://www.youtube.com/watch?v=6NDQl544yEg https://llvm.org/devmtg/2014-10/Slides/Carruth-TheLLVMPassManagerPart2.pdf Q: But it's just a single function call, what gives? A: Imagine you have the following: void baz(void); void bar(void) { baz(); ... } void foo(void) { bar(); ... } So foo() calls bar() calls baz() (and tail calls don't apply). The question being asked in this specific case (and a few other threads) is "why isn't the call to bar() being inlined into foo()?" Again, totally fair question to ask. (In this analogy, bar() is get_smp_config()/early_get_smp_config(), foo() is their caller(s, plural potentially)). The answer lies in the direction that inlining occurs. If you're doing "top down" inlining, then when looking at the edge between foo() and bar(), inlining looks totally reasonable. Inline substitution is performed. Then you get to the edge that existed between bar() and baz() and decide perhaps that baz() is too big, has too many callers, etc. and don't inline baz() into foo(). But if you're doing "bottom up" inlining, then you start by analyzing the edge between baz() and bar(), perhaps you decide to inline baz() into bar(), but now the size of bar() is just over the threshold to inline into foo(), or there's too many callers of bar() to inline into every caller without excessive code bloat, or trips the threshold for any number of concerns that go into the inlining cost model. These cost models are insanely complex (and don't fully generalize), because you need to distill a great deal many inputs into a single yes/no signal: "should I inline?" LLVM's inliner is "bottom up": https://www.youtube.com/watch?v=FnGCDLhaxKU&start=1650 This is different than GCC with is "top down": https://www.youtube.com/embed/FnGCDLhaxKU?start=2301&end=2335&autoplay=1 (That video is 6 years old at this point; I have never looked at the internals of GCC, so I don't personally know whether that is still the case, FWIW) Q: Why doesn't LLVM just do what GCC does? A: The above video should help a little, but this is something not mandated by the standard. There are tradeoffs to either, and only local optima, not general solutions. (https://www.youtube.com/watch?v=FnGCDLhaxKU&start=1563 is still my favorite reference that hints at some of the tradeoffs). Q: But I put the `inline` keyword on the callee? A: Probably deserves its own post, but the `inline` keyword doesn't mean what any rational initial impression would suppose. Language in the standard refers to "inline substitution" and grants a lot of leeway to implementations in terms of whether it's performed at all. There are cases where even with "always_inline" fn attr is applied, and the compiler says "that's nice, but I still cannot perform inline substitution here, I'm sorry." Playing with the inlining heuristics is always difficult, because for each improvement, code that has "ossified" around how inlining was previously done may experience unexpected changes in optimizations performed (see also "Q: What changed?" above). https://www.youtube.com/watch?v=FnGCDLhaxKU&start=1563 Q: Should I put "always_inline" fn attr everywhere? A: No! always_inline can still fail in edge cases, leading to inlining occurring before most optimizations so the same code gets repeatedly optimized the same way just in different functions (this can really hurt compile times). I'm not saying to completely avoid always_inline, it's good and has useful cases, but it's not a silver bullet. There are no silver bullets here. It's semantics have changed since its introduction, and I have seen rare uses that make my skin crawl. Q: Wouldn't it be nice if we could express a desire to inline from individual call sites, rather than on the callee for all call sites? A: Yes; this can be expressed in LLVM IR, but has no analog AFAIK in C or C++. Q: But what about my specific case? A: Without the configuration and compiler version provided, I can't tell you for certain (and if I stopped to look at every case, I probably wouldn't get any work done myself). But what I would do if I wanted to learn more is: 1. Isolate the kernel config that tickles this. This is pretty critical for anyone to reproduce kernel issues found via randconfig builds. 2. Isolate the compiler invocation from the specific build system for the affected TU. For the kernel, that's `make ... V=1`. 3. Rebuild LLVM in debug mode; using Cmake that's -DCMAKE_BUILD_TYPE=Debug. This gives you very powerful compiler introspection features without necessarily needing to attach a debugger. (I don't think this is needed for every pass, seems like I can use Release to debug the inliner, below). It's one of the main things I really appreciate about LLVM. 4. Ask LLVM to print diagnostics about certain passes when they are run. Using the the compiler invocation from 2, add `-mllvm -debug-only=<pass name>`. Yes, looking up <pass name> requires some grepping of LLVM source code. (Lots more general tips: https://www.youtube.com/watch?v=y4b-sgp6VYA) For the inliner: $ cat foo.c void baz(void); void bar(void) { baz(); } void foo(void) { bar(); } $ clang -mllvm -debug-only=inline foo.c -O2 -S Inlining calls in: foo Inlining calls in: foo Inlining (cost=0, threshold=337), Call: call void @bar() Updated inlining SCC: (foo) So LLVM is telling us bar() was inlined into foo(); (baz() can't be because it wasn't defined in this TU). You can use this to "watch" the compiler make decisions about inlining. (full thread: https://lore.kernel.org/lkml/20210225112247.2240389-1-arnd@kernel.org/) I suspect in this specific case, "Interprocedural Sparse Conditional Constant Propagation" sees the calls to the same fn with different constants, propagates those down creating two specialized versions of the callee (so they are distinct functions now), inlines those into get_smp_config()/early_get_smp_config(), then there's too many callers of those in a single TU where inlining would cause excessive code bloat. Either way, it's a deep topic and I'm always happy to learn more about it to help answer questions. Would make a fun blog post or LWN article. Now if only this toolchain would stop catching fire so that I had the time to write such a post... > > Maybe Nick can give some more background here. He mentioned > elsewhere that inlining in clang-13 was completely rewritten and is generally > better than it was before. I'm not sure whether this particular instance > is a case where clang could in fact show an advantage in not inlining > a function, or if this is one case where it got worse and needs to be > tuned better. > > In the end, inlining is a bunch of heuristics that are intended help > most of the time, but both (old) clang and gcc get it wrong in cases that > should have been decided the other way. Here we get a new method > that may go wrong in other ways. > > Arnd -- Thanks, ~Nick Desaulniers ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: mark some mpspec inline functions as __init 2021-02-25 20:31 ` Nick Desaulniers @ 2021-02-25 21:33 ` Borislav Petkov 2021-02-25 21:58 ` Nick Desaulniers 2021-02-25 22:54 ` Peter Zijlstra 1 sibling, 1 reply; 13+ messages in thread From: Borislav Petkov @ 2021-02-25 21:33 UTC (permalink / raw) To: Nick Desaulniers Cc: Thomas Gleixner, Ingo Molnar, the arch/x86 maintainers, Nathan Chancellor, Arnd Bergmann, H. Peter Anvin, linux-kernel, clang-built-linux, Arnd Bergmann, Arthur Eubanks, Chandler Carruth Hi Nick, On Thu, Feb 25, 2021 at 12:31:33PM -0800, Nick Desaulniers wrote: > So LLVM is telling us bar() was inlined into foo(); (baz() can't be > because it wasn't defined in this TU). You can use this to "watch" > the compiler make decisions about inlining. thanks for taking the time to write all this - it is very interesting and reminds me that I simply won't have time in this life of mine to learn about compiler inlining - that's a whole another universe. :-) I hope you can use that text in a blog post too - it is an interesting read. > (full thread: https://lore.kernel.org/lkml/20210225112247.2240389-1-arnd@kernel.org/) > I suspect in this specific case, "Interprocedural Sparse Conditional > Constant Propagation" sees the calls to the same fn with different > constants, propagates those down creating two specialized versions of > the callee (so they are distinct functions now), inlines those into > get_smp_config()/early_get_smp_config(), then there's too many callers > of those in a single TU where inlining would cause excessive code > bloat. Well, there's exactly one caller of get_smp_config - that's setup_arch(). early_get_smp_config() gets called also exactly once in amd_numa_init(). Now, with my simplistic approach, I can replace the lines at those call sites by hand with the x86_init.mpparse.get_smp_config(<arg>); call. So those become exactly one function call. I still don't see how that can be done any differently, frankly. But apparently the cost model has decided that this is not inlineable. Maybe because that function ptr is assigned at boot time and that somehow gets the cost model to give it a very high (or low) value. Or maybe because the wrappers are calling through a variable - the x86_init thing - which is in a different section and that confuses the inliner. Or whatever - totally speculating here. And this brings me to my point - you can't expect people to do all that crazy dance of compiler introspection and understand cost models and compiler optimization just to fix stuff like that. Now, imagine we "fix" this to clang-13's inliner's satisfaction. Now imagine too that gcc Version Next changes their inliner and that inliner says that that "fix" is wrong, for whatever reason, bottom up, top down, whatever. Do you feel the annoyance all around? And since, as you say, there are no silver bullets here, I think for cases like that we'll need a "I know what I'm doing Mr. Compiler, TYVM, even if your cost model says otherwise" facility. And in this case I still think __always_inline is correct. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: mark some mpspec inline functions as __init 2021-02-25 21:33 ` Borislav Petkov @ 2021-02-25 21:58 ` Nick Desaulniers 2021-02-25 22:16 ` Arnd Bergmann 2021-02-26 8:13 ` Borislav Petkov 0 siblings, 2 replies; 13+ messages in thread From: Nick Desaulniers @ 2021-02-25 21:58 UTC (permalink / raw) To: Borislav Petkov Cc: Thomas Gleixner, Ingo Molnar, the arch/x86 maintainers, Nathan Chancellor, Arnd Bergmann, H. Peter Anvin, linux-kernel, clang-built-linux, Arnd Bergmann On Thu, Feb 25, 2021 at 1:33 PM Borislav Petkov <bp@alien8.de> wrote: > > On Thu, Feb 25, 2021 at 12:31:33PM -0800, Nick Desaulniers wrote: > > (full thread: https://lore.kernel.org/lkml/20210225112247.2240389-1-arnd@kernel.org/) > > I suspect in this specific case, "Interprocedural Sparse Conditional > > Constant Propagation" sees the calls to the same fn with different > > constants, propagates those down creating two specialized versions of > > the callee (so they are distinct functions now), inlines those into > > get_smp_config()/early_get_smp_config(), then there's too many callers > > of those in a single TU where inlining would cause excessive code > > bloat. > > Well, there's exactly one caller of get_smp_config - that's setup_arch(). > early_get_smp_config() gets called also exactly once in amd_numa_init(). > > Now, with my simplistic approach, I can replace the lines at those call > sites by hand with the > > x86_init.mpparse.get_smp_config(<arg>); > > call. So those become exactly one function call. I still don't see how > that can be done any differently, frankly. > > But apparently the cost model has decided that this is not inlineable. > Maybe because that function ptr is assigned at boot time and that > somehow gets the cost model to give it a very high (or low) value. Or > maybe because the wrappers are calling through a variable - the x86_init > thing - which is in a different section and that confuses the inliner. > Or whatever - totally speculating here. The config that reproduces it wasn't shared here; I wouldn't be surprised if this was found via randconfig that enabled some config that led to excessive code bloat somewhere somehow. > > And this brings me to my point - you can't expect people to do all that > crazy dance of compiler introspection and understand cost models and > compiler optimization just to fix stuff like that. Oh, I don't expect everyone to; just leaving breadcrumbs showing other people on thread how to fish. ;) > > Now, imagine we "fix" this to clang-13's inliner's satisfaction. Now > imagine too that gcc Version Next changes their inliner and that inliner > says that that "fix" is wrong, for whatever reason, bottom up, top down, > whatever. Do you feel the annoyance all around? Yes, mutually unsatisfiable cases are painful, but I don't think that's what's going on here. > > And since, as you say, there are no silver bullets here, I think for > cases like that we'll need a "I know what I'm doing Mr. Compiler, TYVM, > even if your cost model says otherwise" facility. And in this case I > still think __always_inline is correct. Sure, it doesn't really matter to me which way this is fixed. I personally prefer placing functions in the correct sections and letting the compiler be flexible, since if all of this is to satisfy some randconfig then __always_inline is making a decision for all configs, but perhaps it doesn't matter. -- Thanks, ~Nick Desaulniers ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: mark some mpspec inline functions as __init 2021-02-25 21:58 ` Nick Desaulniers @ 2021-02-25 22:16 ` Arnd Bergmann 2021-02-26 8:13 ` Borislav Petkov 1 sibling, 0 replies; 13+ messages in thread From: Arnd Bergmann @ 2021-02-25 22:16 UTC (permalink / raw) To: Nick Desaulniers Cc: Borislav Petkov, Thomas Gleixner, Ingo Molnar, the arch/x86 maintainers, Nathan Chancellor, Arnd Bergmann, H. Peter Anvin, linux-kernel, clang-built-linux On Thu, Feb 25, 2021 at 10:59 PM 'Nick Desaulniers' via Clang Built Linux <clang-built-linux@googlegroups.com> wrote: > On Thu, Feb 25, 2021 at 1:33 PM Borislav Petkov <bp@alien8.de> wrote: > > > > But apparently the cost model has decided that this is not inlineable. > > Maybe because that function ptr is assigned at boot time and that > > somehow gets the cost model to give it a very high (or low) value. Or > > maybe because the wrappers are calling through a variable - the x86_init > > thing - which is in a different section and that confuses the inliner. > > Or whatever - totally speculating here. > > The config that reproduces it wasn't shared here; I wouldn't be > surprised if this was found via randconfig that enabled some config > that led to excessive code bloat somewhere somehow. It was the case in some similar bugfixes I sent, but in this one, the out-of-line function was fairly trivial -- there is one call to a sanitizer but that's it: .section .init.text,"ax",@progbits .type early_get_smp_config,@function # -- Begin function early_get_smp_config early_get_smp_config: # @early_get_smp_config # %bb.0: callq __sanitizer_cov_trace_pc@PLT movl $1, %edi jmpq *x86_init+40(%rip) # TAILCALL .Lfunc_end4: .size early_get_smp_config, .Lfunc_end4-early_get_smp_config I've uploaded the .config to https://pastebin.com/CvvEScnQ This is with today's linux-next-20210225 plus a number of patches to address other build failures. Arnd ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: mark some mpspec inline functions as __init 2021-02-25 21:58 ` Nick Desaulniers 2021-02-25 22:16 ` Arnd Bergmann @ 2021-02-26 8:13 ` Borislav Petkov 2021-02-26 13:24 ` Arnd Bergmann 1 sibling, 1 reply; 13+ messages in thread From: Borislav Petkov @ 2021-02-26 8:13 UTC (permalink / raw) To: Nick Desaulniers Cc: Thomas Gleixner, Ingo Molnar, the arch/x86 maintainers, Nathan Chancellor, Arnd Bergmann, H. Peter Anvin, linux-kernel, clang-built-linux, Arnd Bergmann On Thu, Feb 25, 2021 at 01:58:48PM -0800, Nick Desaulniers wrote: > The config that reproduces it wasn't shared here; I wouldn't be > surprised if this was found via randconfig that enabled some config > that led to excessive code bloat somewhere somehow. I'm sceptical it is the .config. As I said, those single function calls which I could replace by hand - the wrappers simply make the code cleaner. They could just as well be macros FWIW and then the inlining will be practically forced at preprocess time. > Oh, I don't expect everyone to; just leaving breadcrumbs showing other > people on thread how to fish. ;) Yap, that's appreciated. > Sure, it doesn't really matter to me which way this is fixed. I > personally prefer placing functions in the correct sections and > letting the compiler be flexible, since if all of this is to satisfy > some randconfig then __always_inline is making a decision for all > configs, but perhaps it doesn't matter. I hope. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: mark some mpspec inline functions as __init 2021-02-26 8:13 ` Borislav Petkov @ 2021-02-26 13:24 ` Arnd Bergmann 2021-02-27 15:11 ` Arnd Bergmann 0 siblings, 1 reply; 13+ messages in thread From: Arnd Bergmann @ 2021-02-26 13:24 UTC (permalink / raw) To: Borislav Petkov Cc: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, the arch/x86 maintainers, Nathan Chancellor, Arnd Bergmann, H. Peter Anvin, linux-kernel, clang-built-linux On Fri, Feb 26, 2021 at 9:13 AM Borislav Petkov <bp@alien8.de> wrote: > > On Thu, Feb 25, 2021 at 01:58:48PM -0800, Nick Desaulniers wrote: > > The config that reproduces it wasn't shared here; I wouldn't be > > surprised if this was found via randconfig that enabled some config > > that led to excessive code bloat somewhere somehow. > > I'm sceptical it is the .config. As I said, those single function calls > which I could replace by hand - the wrappers simply make the code > cleaner. They could just as well be macros FWIW and then the inlining > will be practically forced at preprocess time. I managed to track down the configurations: This particular function is not inlined whenever CONFIG_UBSAN_OBJECT_SIZE is enabled and CONFIG_UBSAN_TRAP is disabled, plus obviously any configuration option that is needed to build the file. Arnd ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: mark some mpspec inline functions as __init 2021-02-26 13:24 ` Arnd Bergmann @ 2021-02-27 15:11 ` Arnd Bergmann 0 siblings, 0 replies; 13+ messages in thread From: Arnd Bergmann @ 2021-02-27 15:11 UTC (permalink / raw) To: Borislav Petkov Cc: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, the arch/x86 maintainers, Nathan Chancellor, Arnd Bergmann, H. Peter Anvin, linux-kernel, clang-built-linux On Fri, Feb 26, 2021 at 2:24 PM Arnd Bergmann <arnd@kernel.org> wrote: > > On Fri, Feb 26, 2021 at 9:13 AM Borislav Petkov <bp@alien8.de> wrote: > > > > On Thu, Feb 25, 2021 at 01:58:48PM -0800, Nick Desaulniers wrote: > > > The config that reproduces it wasn't shared here; I wouldn't be > > > surprised if this was found via randconfig that enabled some config > > > that led to excessive code bloat somewhere somehow. > > > > I'm sceptical it is the .config. As I said, those single function calls > > which I could replace by hand - the wrappers simply make the code > > cleaner. They could just as well be macros FWIW and then the inlining > > will be practically forced at preprocess time. > > I managed to track down the configurations: This particular function is > not inlined whenever CONFIG_UBSAN_OBJECT_SIZE is enabled > and CONFIG_UBSAN_TRAP is disabled, plus obviously any > configuration option that is needed to build the file. And I now had another look at the output after reducing the test case with cvise to: struct b { void *c; }; struct { struct b d; } extern e; int f; __attribute__((__cold__)) int a(); static inline void early_get_smp_config() {(void) e.d.c; } int g() { if (a()) return 2; a(); if (f) return f; a(); early_get_smp_config(); return 0; } See https://godbolt.org/z/8qbY65 Some observations: - The early_get_smp_config function literally does nothing in the reduced test case, but is still not inlined. - This happens regardless of target architecture - It happens in a code path of the calling function that is 'cold' at this point, which presumably is an indication to clang that any functions called from here are also cold, and not worth inlining. - I have found no indication why -fsanitize=object-size should make a difference. Arnd ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: mark some mpspec inline functions as __init 2021-02-25 20:31 ` Nick Desaulniers 2021-02-25 21:33 ` Borislav Petkov @ 2021-02-25 22:54 ` Peter Zijlstra 1 sibling, 0 replies; 13+ messages in thread From: Peter Zijlstra @ 2021-02-25 22:54 UTC (permalink / raw) To: Nick Desaulniers Cc: Borislav Petkov, Thomas Gleixner, Ingo Molnar, the arch/x86 maintainers, Nathan Chancellor, Arnd Bergmann, H. Peter Anvin, linux-kernel, clang-built-linux, Arnd Bergmann, Arthur Eubanks, Chandler Carruth On Thu, Feb 25, 2021 at 12:31:33PM -0800, Nick Desaulniers wrote: > Q: But I put the `inline` keyword on the callee? > A: Probably deserves its own post, but the `inline` keyword doesn't > mean what any rational initial impression would suppose. Language in > the standard refers to "inline substitution" and grants a lot of > leeway to implementations in terms of whether it's performed at all. That's just weasle wording to do the wrong thing :/ GCC (and other compilers) have been saying inline is a valid substitute for macros. You then reading the spec and saying we can do this ass backwards and screw everybody who's been expecting things to top-down is just not acceptible. The C spec is notoriously bad, because it fails to actually specify and then we get shit like this :-( And if you then take that 'freedom' and implement behaviour that 'doesn't mean what any rational initial impression would suppose', then you're just being a twat. Life is hard enough without compilers trying to screw you over on purpose :/ > There are cases where even with "always_inline" fn attr is applied, > and the compiler says "that's nice, but I still cannot perform inline > substitution here, I'm sorry." That's a hard fail from where I'm sitting. Can we please get a compiler error when that happens? That is, we're relying on this for correctness, if the compiler then goes and ignores it, we've got *serious* problems. This really MUST NOT happen. And I'm not joking, I can get you a non-booting kernel by removing just a few __always_inline in the right place. > There are no silver bullets here. It's semantics have > changed since its introduction, That isn't our doing, if the compiler redefines its keywords, don't tell us our program is broken. You're the ones that have been changing the meaning of thigs. > and I have seen rare uses that make my skin crawl. Do tell... ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2021-02-27 15:12 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-02-25 11:22 [PATCH] x86: mark some mpspec inline functions as __init Arnd Bergmann 2021-02-25 11:45 ` Borislav Petkov 2021-02-25 12:18 ` Arnd Bergmann 2021-02-25 12:42 ` Borislav Petkov 2021-02-25 13:20 ` Arnd Bergmann 2021-02-25 20:31 ` Nick Desaulniers 2021-02-25 21:33 ` Borislav Petkov 2021-02-25 21:58 ` Nick Desaulniers 2021-02-25 22:16 ` Arnd Bergmann 2021-02-26 8:13 ` Borislav Petkov 2021-02-26 13:24 ` Arnd Bergmann 2021-02-27 15:11 ` Arnd Bergmann 2021-02-25 22:54 ` Peter Zijlstra
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).