* objtool warnings for kernel/trace/trace_selftest_dynamic.o @ 2018-12-16 18:33 Arnd Bergmann 2018-12-17 17:39 ` Josh Poimboeuf 0 siblings, 1 reply; 32+ messages in thread From: Arnd Bergmann @ 2018-12-16 18:33 UTC (permalink / raw) To: Josh Poimboeuf Cc: Peter Zijlstra, Linux Kernel Mailing List, the arch/x86 maintainers [-- Attachment #1: Type: text/plain, Size: 1657 bytes --] Hi Josh, In randconfig tests with gcc-8.1, I get this warning every few hundred builds, tried it on both next/master and 4.19.y-stable: kernel/trace/trace_selftest_dynamic.o: warning: objtool: trace_selftest_dynamic_test_func()+0x5: call without frame pointer save/setup kernel/trace/trace_selftest_dynamic.o: warning: objtool: trace_selftest_dynamic_test_func2()+0x5: call without frame pointer save/setup $ objdump -dr build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o: file format elf64-x86-64 Disassembly of section .text: 0000000000000000 <trace_selftest_dynamic_test_func>: 0: e8 00 00 00 00 callq 5 <trace_selftest_dynamic_test_func+0x5> 1: R_X86_64_PC32 __fentry__-0x4 5: e8 00 00 00 00 callq a <trace_selftest_dynamic_test_func+0xa> 6: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4 a: 31 c0 xor %eax,%eax c: c3 retq d: 0f 1f 00 nopl (%rax) 0000000000000010 <trace_selftest_dynamic_test_func2>: 10: e8 00 00 00 00 callq 15 <trace_selftest_dynamic_test_func2+0x5> 11: R_X86_64_PC32 __fentry__-0x4 15: e8 00 00 00 00 callq 1a <trace_selftest_dynamic_test_func2+0xa> 16: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4 1a: 31 c0 xor %eax,%eax 1c: c3 retq I found this reported in http://kisskb.ellerman.id.au/kisskb/buildresult/13499139/, but could not find an existing fix or analysis. Arnd [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 25916 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-16 18:33 objtool warnings for kernel/trace/trace_selftest_dynamic.o Arnd Bergmann @ 2018-12-17 17:39 ` Josh Poimboeuf 2018-12-17 18:04 ` Andi Kleen 0 siblings, 1 reply; 32+ messages in thread From: Josh Poimboeuf @ 2018-12-17 17:39 UTC (permalink / raw) To: Arnd Bergmann Cc: Peter Zijlstra, Linux Kernel Mailing List, the arch/x86 maintainers, Andi Kleen, Steven Rostedt, Miroslav Benes On Sun, Dec 16, 2018 at 07:33:11PM +0100, Arnd Bergmann wrote: > Hi Josh, > > In randconfig tests with gcc-8.1, I get this warning every > few hundred builds, tried it on both next/master and 4.19.y-stable: > > kernel/trace/trace_selftest_dynamic.o: warning: objtool: > trace_selftest_dynamic_test_func()+0x5: call without frame pointer > save/setup > kernel/trace/trace_selftest_dynamic.o: warning: objtool: > trace_selftest_dynamic_test_func2()+0x5: call without frame pointer > save/setup > > $ objdump -dr build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o > > build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o: > file format elf64-x86-64 > > Disassembly of section .text: > > 0000000000000000 <trace_selftest_dynamic_test_func>: > 0: e8 00 00 00 00 callq 5 > <trace_selftest_dynamic_test_func+0x5> > 1: R_X86_64_PC32 __fentry__-0x4 > 5: e8 00 00 00 00 callq a > <trace_selftest_dynamic_test_func+0xa> > 6: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4 > a: 31 c0 xor %eax,%eax > c: c3 retq > d: 0f 1f 00 nopl (%rax) > > 0000000000000010 <trace_selftest_dynamic_test_func2>: > 10: e8 00 00 00 00 callq 15 > <trace_selftest_dynamic_test_func2+0x5> > 11: R_X86_64_PC32 __fentry__-0x4 > 15: e8 00 00 00 00 callq 1a > <trace_selftest_dynamic_test_func2+0xa> > 16: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4 > 1a: 31 c0 xor %eax,%eax > 1c: c3 retq > > I found this reported in > http://kisskb.ellerman.id.au/kisskb/buildresult/13499139/, but could > not find an existing fix or analysis. Thanks for reporting this Arnd. The problem is that, for some reason, __noclone is preventing GCC from creating frame pointers for these functions. Miroslav said that __noclone is not recommended by GCC developers, and that __used can be used instead for the same purpose: https://lkml.kernel.org/r/alpine.LSU.2.21.1812171256390.3087@pobox.suse.cz Andi, is __noclone really needed here, since the functions aren't static? Or does LTO cause them to be treated like static functions? Or if it is really needed, would __used be sufficient instead? noinline __noclone int DYN_FTRACE_TEST_NAME(void) { /* used to call mcount */ return 0; } noinline __noclone int DYN_FTRACE_TEST_NAME2(void) { /* used to call mcount */ return 0; } -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-17 17:39 ` Josh Poimboeuf @ 2018-12-17 18:04 ` Andi Kleen 2018-12-17 18:16 ` Josh Poimboeuf 0 siblings, 1 reply; 32+ messages in thread From: Andi Kleen @ 2018-12-17 18:04 UTC (permalink / raw) To: Josh Poimboeuf Cc: Arnd Bergmann, Peter Zijlstra, Linux Kernel Mailing List, the arch/x86 maintainers, Steven Rostedt, Miroslav Benes On Mon, Dec 17, 2018 at 11:39:00AM -0600, Josh Poimboeuf wrote: > On Sun, Dec 16, 2018 at 07:33:11PM +0100, Arnd Bergmann wrote: > > Hi Josh, > > > > In randconfig tests with gcc-8.1, I get this warning every > > few hundred builds, tried it on both next/master and 4.19.y-stable: > > > > kernel/trace/trace_selftest_dynamic.o: warning: objtool: > > trace_selftest_dynamic_test_func()+0x5: call without frame pointer > > save/setup > > kernel/trace/trace_selftest_dynamic.o: warning: objtool: > > trace_selftest_dynamic_test_func2()+0x5: call without frame pointer > > save/setup > > > > $ objdump -dr build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o > > > > build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o: > > file format elf64-x86-64 > > > > Disassembly of section .text: > > > > 0000000000000000 <trace_selftest_dynamic_test_func>: > > 0: e8 00 00 00 00 callq 5 > > <trace_selftest_dynamic_test_func+0x5> > > 1: R_X86_64_PC32 __fentry__-0x4 > > 5: e8 00 00 00 00 callq a > > <trace_selftest_dynamic_test_func+0xa> > > 6: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4 > > a: 31 c0 xor %eax,%eax > > c: c3 retq > > d: 0f 1f 00 nopl (%rax) > > > > 0000000000000010 <trace_selftest_dynamic_test_func2>: > > 10: e8 00 00 00 00 callq 15 > > <trace_selftest_dynamic_test_func2+0x5> > > 11: R_X86_64_PC32 __fentry__-0x4 > > 15: e8 00 00 00 00 callq 1a > > <trace_selftest_dynamic_test_func2+0xa> > > 16: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4 > > 1a: 31 c0 xor %eax,%eax > > 1c: c3 retq > > > > I found this reported in > > http://kisskb.ellerman.id.au/kisskb/buildresult/13499139/, but could > > not find an existing fix or analysis. > > Thanks for reporting this Arnd. > > The problem is that, for some reason, __noclone is preventing GCC from > creating frame pointers for these functions. Miroslav said that That seems weird. Are you sure it's not just because they are empty? AFAIK gcc doesn't necessarily generate frame pointers for empty functions. > __noclone is not recommended by GCC developers, and that __used can be > used instead for the same purpose: > > https://lkml.kernel.org/r/alpine.LSU.2.21.1812171256390.3087@pobox.suse.cz > > Andi, > > is __noclone really needed here, since the functions aren't static? Or > does LTO cause them to be treated like static functions? Yes LTO causes the to be treated like static functions. I guess noclone is unlikely to be really needed here because these functions are unlikely to be cloned. So as a workaround it could be removed. But note we have other noclone functions in the tree (like in KVM) which actually need it. -Andi ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-17 18:04 ` Andi Kleen @ 2018-12-17 18:16 ` Josh Poimboeuf 2018-12-17 19:29 ` Peter Zijlstra 2018-12-17 21:03 ` Andi Kleen 0 siblings, 2 replies; 32+ messages in thread From: Josh Poimboeuf @ 2018-12-17 18:16 UTC (permalink / raw) To: Andi Kleen Cc: Arnd Bergmann, Peter Zijlstra, Linux Kernel Mailing List, the arch/x86 maintainers, Steven Rostedt, Miroslav Benes On Mon, Dec 17, 2018 at 10:04:34AM -0800, Andi Kleen wrote: > On Mon, Dec 17, 2018 at 11:39:00AM -0600, Josh Poimboeuf wrote: > > On Sun, Dec 16, 2018 at 07:33:11PM +0100, Arnd Bergmann wrote: > > > Hi Josh, > > > > > > In randconfig tests with gcc-8.1, I get this warning every > > > few hundred builds, tried it on both next/master and 4.19.y-stable: > > > > > > kernel/trace/trace_selftest_dynamic.o: warning: objtool: > > > trace_selftest_dynamic_test_func()+0x5: call without frame pointer > > > save/setup > > > kernel/trace/trace_selftest_dynamic.o: warning: objtool: > > > trace_selftest_dynamic_test_func2()+0x5: call without frame pointer > > > save/setup > > > > > > $ objdump -dr build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o > > > > > > build/x86/0x90C84554_defconfig/kernel/trace/trace_selftest_dynamic.o: > > > file format elf64-x86-64 > > > > > > Disassembly of section .text: > > > > > > 0000000000000000 <trace_selftest_dynamic_test_func>: > > > 0: e8 00 00 00 00 callq 5 > > > <trace_selftest_dynamic_test_func+0x5> > > > 1: R_X86_64_PC32 __fentry__-0x4 > > > 5: e8 00 00 00 00 callq a > > > <trace_selftest_dynamic_test_func+0xa> > > > 6: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4 > > > a: 31 c0 xor %eax,%eax > > > c: c3 retq > > > d: 0f 1f 00 nopl (%rax) > > > > > > 0000000000000010 <trace_selftest_dynamic_test_func2>: > > > 10: e8 00 00 00 00 callq 15 > > > <trace_selftest_dynamic_test_func2+0x5> > > > 11: R_X86_64_PC32 __fentry__-0x4 > > > 15: e8 00 00 00 00 callq 1a > > > <trace_selftest_dynamic_test_func2+0xa> > > > 16: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4 > > > 1a: 31 c0 xor %eax,%eax > > > 1c: c3 retq > > > > > > I found this reported in > > > http://kisskb.ellerman.id.au/kisskb/buildresult/13499139/, but could > > > not find an existing fix or analysis. > > > > Thanks for reporting this Arnd. > > > > The problem is that, for some reason, __noclone is preventing GCC from > > creating frame pointers for these functions. Miroslav said that > > That seems weird. > > Are you sure it's not just because they are empty? AFAIK > gcc doesn't necessarily generate frame pointers for empty functions. I suspected that it was because they're empty, however I didn't see this warning for other leaf functions. The sancov plugin is presumably taking care of adding frame pointers where needed. Also, adding -mno-omit-leaf-frame-pointer didn't fix it. And anyway I confirmed that it was fixed by removing __noclone. > > __noclone is not recommended by GCC developers, and that __used can be > > used instead for the same purpose: > > > > https://lkml.kernel.org/r/alpine.LSU.2.21.1812171256390.3087@pobox.suse.cz > > > > Andi, > > > > is __noclone really needed here, since the functions aren't static? Or > > does LTO cause them to be treated like static functions? > > Yes LTO causes the to be treated like static functions. > > I guess noclone is unlikely to be really needed here because these > functions are unlikely to be cloned. > > So as a workaround it could be removed. > > But note we have other noclone functions in the tree (like in KVM) > which actually need it. How about we just use the __used attribute then? It seems to have the same result of preventing IPA optimizations (without the weird side effect of missing frame pointers). -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-17 18:16 ` Josh Poimboeuf @ 2018-12-17 19:29 ` Peter Zijlstra 2018-12-17 20:55 ` Andi Kleen 2018-12-17 21:31 ` Josh Poimboeuf 2018-12-17 21:03 ` Andi Kleen 1 sibling, 2 replies; 32+ messages in thread From: Peter Zijlstra @ 2018-12-17 19:29 UTC (permalink / raw) To: Josh Poimboeuf Cc: Andi Kleen, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers, Steven Rostedt, Miroslav Benes On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote: > > Yes LTO causes the to be treated like static functions. > > > > I guess noclone is unlikely to be really needed here because these > > functions are unlikely to be cloned. > > > > So as a workaround it could be removed. > > > > But note we have other noclone functions in the tree (like in KVM) > > which actually need it. > > How about we just use the __used attribute then? It seems to have the > same result of preventing IPA optimizations (without the weird side > effect of missing frame pointers). AFAIK we don't have any in-tree LTO, so it can all go in the bin. When/if we get the LTO trainwreck sorted -- which very much includes getting that memory-order-consume fixed -- we can revisit all that. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-17 19:29 ` Peter Zijlstra @ 2018-12-17 20:55 ` Andi Kleen 2018-12-17 22:35 ` Peter Zijlstra 2018-12-17 21:31 ` Josh Poimboeuf 1 sibling, 1 reply; 32+ messages in thread From: Andi Kleen @ 2018-12-17 20:55 UTC (permalink / raw) To: Peter Zijlstra Cc: Josh Poimboeuf, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers, Steven Rostedt, Miroslav Benes On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote: > On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote: > > > > Yes LTO causes the to be treated like static functions. > > > > > > I guess noclone is unlikely to be really needed here because these > > > functions are unlikely to be cloned. > > > > > > So as a workaround it could be removed. > > > > > > But note we have other noclone functions in the tree (like in KVM) > > > which actually need it. > > > > How about we just use the __used attribute then? It seems to have the > > same result of preventing IPA optimizations (without the weird side > > effect of missing frame pointers). > > AFAIK we don't have any in-tree LTO, so it can all go in the bin. I have patches for 4.20, and I was actually thinking about resending soon. It will need a few changes, but not too bad. FWIW there's also a user base who used the out of tree patches for some time. > > When/if we get the LTO trainwreck sorted -- which very much includes > getting that memory-order-consume fixed -- we can revisit all that. What do you mean? I'm not aware of any LTO problems with memory-order-consume? -Andi ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-17 20:55 ` Andi Kleen @ 2018-12-17 22:35 ` Peter Zijlstra 2018-12-17 23:59 ` Andi Kleen 0 siblings, 1 reply; 32+ messages in thread From: Peter Zijlstra @ 2018-12-17 22:35 UTC (permalink / raw) To: Andi Kleen Cc: Josh Poimboeuf, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers, Steven Rostedt, Miroslav Benes On Mon, Dec 17, 2018 at 12:55:35PM -0800, Andi Kleen wrote: > On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote: > > When/if we get the LTO trainwreck sorted -- which very much includes > > getting that memory-order-consume fixed -- we can revisit all that. > > What do you mean? I'm not aware of any LTO problems with memory-order-consume? The compiler is basically allowed to break RCU (and anything else that depends on read-read dependencies). LTO makes it _far_ more likely this happens. We need guarantees (and possible switches) from the compiler folks that this will not happen before I'll retract my NAK from any LTO enabling. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-17 22:35 ` Peter Zijlstra @ 2018-12-17 23:59 ` Andi Kleen 2018-12-18 9:19 ` Peter Zijlstra 0 siblings, 1 reply; 32+ messages in thread From: Andi Kleen @ 2018-12-17 23:59 UTC (permalink / raw) To: Peter Zijlstra Cc: Josh Poimboeuf, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers, Steven Rostedt, Miroslav Benes On Mon, Dec 17, 2018 at 11:35:24PM +0100, Peter Zijlstra wrote: > On Mon, Dec 17, 2018 at 12:55:35PM -0800, Andi Kleen wrote: > > On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote: > > > When/if we get the LTO trainwreck sorted -- which very much includes > > > getting that memory-order-consume fixed -- we can revisit all that. > > > > What do you mean? I'm not aware of any LTO problems with memory-order-consume? > > The compiler is basically allowed to break RCU (and anything else that > depends on read-read dependencies). LTO makes it _far_ more likely this > happens. > > We need guarantees (and possible switches) from the compiler folks that > this will not happen before I'll retract my NAK from any LTO enabling. Are you really saying that the current RCU code depends on cross file inlining not happening? If that is true it's quite bad. Of course it would be a untolerable situation because all kinds of changes can break it, and there would be likely already plenty of broken code in tree. I have a hard time believing it though. Do you have a concrete example or is this just FUD? BTW I have a user base for LTO and so far noone has reported any issues like this. -Andi ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-17 23:59 ` Andi Kleen @ 2018-12-18 9:19 ` Peter Zijlstra 2018-12-18 21:22 ` Andi Kleen 0 siblings, 1 reply; 32+ messages in thread From: Peter Zijlstra @ 2018-12-18 9:19 UTC (permalink / raw) To: Andi Kleen Cc: Josh Poimboeuf, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers, Steven Rostedt, Miroslav Benes On Mon, Dec 17, 2018 at 03:59:50PM -0800, Andi Kleen wrote: > BTW I have a user base for LTO and so far noone has reported any issues > like this. Because ordering issues are immediately obvious and easy to debug no doubt. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 9:19 ` Peter Zijlstra @ 2018-12-18 21:22 ` Andi Kleen 0 siblings, 0 replies; 32+ messages in thread From: Andi Kleen @ 2018-12-18 21:22 UTC (permalink / raw) To: Peter Zijlstra Cc: Josh Poimboeuf, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers, Steven Rostedt, Miroslav Benes On Tue, Dec 18, 2018 at 10:19:32AM +0100, Peter Zijlstra wrote: > On Mon, Dec 17, 2018 at 03:59:50PM -0800, Andi Kleen wrote: > > BTW I have a user base for LTO and so far noone has reported any issues > > like this. > > Because ordering issues are immediately obvious and easy to debug no > doubt. Again if it's a real problem it should be already there because we have a lot of code inside files that gets happily inlined. Do you have an example in the tree where it happened or could happen? -Andi ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-17 19:29 ` Peter Zijlstra 2018-12-17 20:55 ` Andi Kleen @ 2018-12-17 21:31 ` Josh Poimboeuf 2018-12-17 22:36 ` Steven Rostedt 2018-12-17 23:54 ` Andi Kleen 1 sibling, 2 replies; 32+ messages in thread From: Josh Poimboeuf @ 2018-12-17 21:31 UTC (permalink / raw) To: Peter Zijlstra Cc: Andi Kleen, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers, Steven Rostedt, Miroslav Benes On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote: > On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote: > > > > Yes LTO causes the to be treated like static functions. > > > > > > I guess noclone is unlikely to be really needed here because these > > > functions are unlikely to be cloned. > > > > > > So as a workaround it could be removed. > > > > > > But note we have other noclone functions in the tree (like in KVM) > > > which actually need it. > > > > How about we just use the __used attribute then? It seems to have the > > same result of preventing IPA optimizations (without the weird side > > effect of missing frame pointers). > > AFAIK we don't have any in-tree LTO, so it can all go in the bin. > > When/if we get the LTO trainwreck sorted -- which very much includes > getting that memory-order-consume fixed -- we can revisit all that. Ok, then if there are no objections I'll just send a revert of: dd3dad0d716d ("ftrace: Mark function tracer test functions noinline/noclone") -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-17 21:31 ` Josh Poimboeuf @ 2018-12-17 22:36 ` Steven Rostedt 2018-12-18 0:06 ` Andi Kleen 2018-12-18 3:05 ` Josh Poimboeuf 2018-12-17 23:54 ` Andi Kleen 1 sibling, 2 replies; 32+ messages in thread From: Steven Rostedt @ 2018-12-17 22:36 UTC (permalink / raw) To: Josh Poimboeuf Cc: Peter Zijlstra, Andi Kleen, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers, Miroslav Benes On Mon, 17 Dec 2018 15:31:26 -0600 Josh Poimboeuf <jpoimboe@redhat.com> wrote: > On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote: > > On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote: > > > > > > Yes LTO causes the to be treated like static functions. > > > > > > > > I guess noclone is unlikely to be really needed here because these > > > > functions are unlikely to be cloned. > > > > > > > > So as a workaround it could be removed. > > > > > > > > But note we have other noclone functions in the tree (like in KVM) > > > > which actually need it. > > > > > > How about we just use the __used attribute then? It seems to have the > > > same result of preventing IPA optimizations (without the weird side > > > effect of missing frame pointers). > > > > AFAIK we don't have any in-tree LTO, so it can all go in the bin. > > > > When/if we get the LTO trainwreck sorted -- which very much includes > > getting that memory-order-consume fixed -- we can revisit all that. > > Ok, then if there are no objections I'll just send a revert of: > > dd3dad0d716d ("ftrace: Mark function tracer test functions noinline/noclone") > Should it be reverted, or just remove the noclone, and keep the noinline? -- Steve ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-17 22:36 ` Steven Rostedt @ 2018-12-18 0:06 ` Andi Kleen 2018-12-18 2:49 ` Josh Poimboeuf 2018-12-18 3:05 ` Josh Poimboeuf 1 sibling, 1 reply; 32+ messages in thread From: Andi Kleen @ 2018-12-18 0:06 UTC (permalink / raw) To: Steven Rostedt Cc: Josh Poimboeuf, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers, Miroslav Benes On Mon, Dec 17, 2018 at 05:36:44PM -0500, Steven Rostedt wrote: > On Mon, 17 Dec 2018 15:31:26 -0600 > Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > > On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote: > > > On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote: > > > > > > > > Yes LTO causes the to be treated like static functions. > > > > > > > > > > I guess noclone is unlikely to be really needed here because these > > > > > functions are unlikely to be cloned. > > > > > > > > > > So as a workaround it could be removed. > > > > > > > > > > But note we have other noclone functions in the tree (like in KVM) > > > > > which actually need it. > > > > > > > > How about we just use the __used attribute then? It seems to have the > > > > same result of preventing IPA optimizations (without the weird side > > > > effect of missing frame pointers). > > > > > > AFAIK we don't have any in-tree LTO, so it can all go in the bin. > > > > > > When/if we get the LTO trainwreck sorted -- which very much includes > > > getting that memory-order-consume fixed -- we can revisit all that. > > > > Ok, then if there are no objections I'll just send a revert of: > > > > dd3dad0d716d ("ftrace: Mark function tracer test functions noinline/noclone") > > > > Should it be reverted, or just remove the noclone, and keep the > noinline? It should not be touched for now, until it is properly debugged. IMHO Josh's explanation doesn't make much sense and there was a lot of handwaving And just fixing one case isn't good enough because there are other noclone functions in the tree. It the problem is the plugin the plugin needs to be fixed. If the problem is gcc we need a gcc test case and bug, with some analysis, and then based on that select the proper workaround. -Andi ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 0:06 ` Andi Kleen @ 2018-12-18 2:49 ` Josh Poimboeuf 2018-12-18 4:22 ` Andi Kleen 2018-12-18 9:28 ` Miroslav Benes 0 siblings, 2 replies; 32+ messages in thread From: Josh Poimboeuf @ 2018-12-18 2:49 UTC (permalink / raw) To: Andi Kleen Cc: Steven Rostedt, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers, Miroslav Benes On Mon, Dec 17, 2018 at 04:06:18PM -0800, Andi Kleen wrote: > On Mon, Dec 17, 2018 at 05:36:44PM -0500, Steven Rostedt wrote: > > On Mon, 17 Dec 2018 15:31:26 -0600 > > Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > > > > On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote: > > > > On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote: > > > > > > > > > > Yes LTO causes the to be treated like static functions. > > > > > > > > > > > > I guess noclone is unlikely to be really needed here because these > > > > > > functions are unlikely to be cloned. > > > > > > > > > > > > So as a workaround it could be removed. > > > > > > > > > > > > But note we have other noclone functions in the tree (like in KVM) > > > > > > which actually need it. > > > > > > > > > > How about we just use the __used attribute then? It seems to have the > > > > > same result of preventing IPA optimizations (without the weird side > > > > > effect of missing frame pointers). > > > > > > > > AFAIK we don't have any in-tree LTO, so it can all go in the bin. > > > > > > > > When/if we get the LTO trainwreck sorted -- which very much includes > > > > getting that memory-order-consume fixed -- we can revisit all that. > > > > > > Ok, then if there are no objections I'll just send a revert of: > > > > > > dd3dad0d716d ("ftrace: Mark function tracer test functions noinline/noclone") Sorry for suggesting this prematurely, my email client stopped syncing and I missed your later replies to Peter about this. > > Should it be reverted, or just remove the noclone, and keep the > > noinline? > > It should not be touched for now, until it is properly debugged. > > IMHO Josh's explanation doesn't make much sense and there > was a lot of handwaving > > And just fixing one case isn't good enough because there are other > noclone functions in the tree. > > It the problem is the plugin the plugin needs to be fixed. > > If the problem is gcc we need a gcc test case and bug, with > some analysis, and then based on that select the proper workaround. The plugin is only used for older versions of GCC. Newer versions have the same functionality builtin with -fsanitize-coverage=trace-pc. So the problem is GCC. We're using a function attribute which at least oneGCC developer doesn't recommend. If you want to keep the LTO support then '__used' seems like a much better choice. -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 2:49 ` Josh Poimboeuf @ 2018-12-18 4:22 ` Andi Kleen 2018-12-18 9:28 ` Miroslav Benes 1 sibling, 0 replies; 32+ messages in thread From: Andi Kleen @ 2018-12-18 4:22 UTC (permalink / raw) To: Josh Poimboeuf Cc: Steven Rostedt, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers, Miroslav Benes > The plugin is only used for older versions of GCC. Newer versions have > the same functionality builtin with -fsanitize-coverage=trace-pc. Ok and the frame pointer issue is with the older version only? > > So the problem is GCC. We're using a function attribute which at least > oneGCC developer doesn't recommend. If you want to keep the LTO support > then '__used' seems like a much better choice. I guess __used will work for now. Still would be better to root cause it properly, but I guess that's ok for now. The lack of understanding may eventually come back to bite you later. FWIW i did some tests and I don't see noclone affecting frame pointer with gcc 8. The only thing I saw was that empty functions ended up without frame pointer. I suspect you'll need to fix up the other users of noclone too. -Andi ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 2:49 ` Josh Poimboeuf 2018-12-18 4:22 ` Andi Kleen @ 2018-12-18 9:28 ` Miroslav Benes 2018-12-18 12:15 ` Martin Jambor 1 sibling, 1 reply; 32+ messages in thread From: Miroslav Benes @ 2018-12-18 9:28 UTC (permalink / raw) To: Josh Poimboeuf Cc: Andi Kleen, Steven Rostedt, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers, mjambor On Mon, 17 Dec 2018, Josh Poimboeuf wrote: > On Mon, Dec 17, 2018 at 04:06:18PM -0800, Andi Kleen wrote: > > On Mon, Dec 17, 2018 at 05:36:44PM -0500, Steven Rostedt wrote: > > > On Mon, 17 Dec 2018 15:31:26 -0600 > > > Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > > > > > > On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote: > > > > > On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote: > > > > > > > > > > > > Yes LTO causes the to be treated like static functions. > > > > > > > > > > > > > > I guess noclone is unlikely to be really needed here because these > > > > > > > functions are unlikely to be cloned. > > > > > > > > > > > > > > So as a workaround it could be removed. > > > > > > > > > > > > > > But note we have other noclone functions in the tree (like in KVM) > > > > > > > which actually need it. > > > > > > > > > > > > How about we just use the __used attribute then? It seems to have the > > > > > > same result of preventing IPA optimizations (without the weird side > > > > > > effect of missing frame pointers). > > > > > > > > > > AFAIK we don't have any in-tree LTO, so it can all go in the bin. > > > > > > > > > > When/if we get the LTO trainwreck sorted -- which very much includes > > > > > getting that memory-order-consume fixed -- we can revisit all that. > > > > > > > > Ok, then if there are no objections I'll just send a revert of: > > > > > > > > dd3dad0d716d ("ftrace: Mark function tracer test functions noinline/noclone") > > Sorry for suggesting this prematurely, my email client stopped syncing > and I missed your later replies to Peter about this. > > > > Should it be reverted, or just remove the noclone, and keep the > > > noinline? > > > > It should not be touched for now, until it is properly debugged. > > > > IMHO Josh's explanation doesn't make much sense and there > > was a lot of handwaving > > > > And just fixing one case isn't good enough because there are other > > noclone functions in the tree. > > > > It the problem is the plugin the plugin needs to be fixed. > > > > If the problem is gcc we need a gcc test case and bug, with > > some analysis, and then based on that select the proper workaround. > > The plugin is only used for older versions of GCC. Newer versions have > the same functionality builtin with -fsanitize-coverage=trace-pc. > > So the problem is GCC. We're using a function attribute which at least > oneGCC developer doesn't recommend. If you want to keep the LTO support > then '__used' seems like a much better choice. Martin added to CC. Martin, the thread starts here http://lkml.kernel.org/r/CAK8P3a2K1K21ePBFbApaTKPCk+=Bqj0LyWoK1MdFb1s9ZwjfPg@mail.gmail.com Can you explain the background of noclone vs. used attributes, please? We discussed it yesterday and I understood that maybe we should not rely on noclone that much. However it is used in the kernel. Should we avoid it in general and replace it with something else (used)? It definitely makes sense in our livepatching samples which Josh mentioned previously in the thread. Thanks, Miroslav ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 9:28 ` Miroslav Benes @ 2018-12-18 12:15 ` Martin Jambor 2018-12-18 12:31 ` Steven Rostedt ` (2 more replies) 0 siblings, 3 replies; 32+ messages in thread From: Martin Jambor @ 2018-12-18 12:15 UTC (permalink / raw) To: Miroslav Benes, Josh Poimboeuf Cc: Andi Kleen, Steven Rostedt, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers Hi, On Tue, Dec 18 2018, Miroslav Benes wrote: >> Sorry for suggesting this prematurely, my email client stopped syncing >> and I missed your later replies to Peter about this. >> >> > > Should it be reverted, or just remove the noclone, and keep the >> > > noinline? >> > >> > It should not be touched for now, until it is properly debugged. >> > >> > IMHO Josh's explanation doesn't make much sense and there >> > was a lot of handwaving >> > >> > And just fixing one case isn't good enough because there are other >> > noclone functions in the tree. >> > >> > It the problem is the plugin the plugin needs to be fixed. >> > >> > If the problem is gcc we need a gcc test case and bug, with >> > some analysis, and then based on that select the proper workaround. >> >> The plugin is only used for older versions of GCC. Newer versions have >> the same functionality builtin with -fsanitize-coverage=trace-pc. >> >> So the problem is GCC. That's a little bit harsh :-) >> We're using a function attribute which at least >> oneGCC developer doesn't recommend. If you want to keep the LTO support >> then '__used' seems like a much better choice. > > Martin added to CC. > > Martin, the thread starts here > http://lkml.kernel.org/r/CAK8P3a2K1K21ePBFbApaTKPCk+=Bqj0LyWoK1MdFb1s9ZwjfPg@mail.gmail.com > > Can you explain the background of noclone vs. used attributes, please? > We discussed it yesterday and I understood that maybe we should not rely > on noclone that much. However it is used in the kernel. Should we avoid > it in general and replace it with something else (used)? OK, I have read through it and with the caveats that I don't quite understand what the failure is, that also believe attribute noclone should not affect frame pointer generation, and that I don't quite get how LTO comes into play, my comments are the following: I am the developer who introduced attribute noclone to GCC and also the one who advises against using it :-) ...at least without also using the noinline attribute, the combination means "I want only one or zero copies of this function in the compiled assembly" which you might need if you do fancy stuff in inline assembly, for example. I believe that when people use noclone on its own, in 99 out 100 cases they actually want something else. Usually there is something that references the function from code (such as assembly) or a tool that the compiler does know about and then they should use the "used" attribute. That is what I understood to be the use case here and therefore I recommended it. In other cases people want all inter-procedural (IPA) analyses to stay away from a function and then they should use attribute noipa (or in older GCCs, the combination of noinline and noclone). The attribute was introduced because it is useful in GCC testsuite, and although I think occasions when it is useful on its own elsewhere are very rare and quite obscure, they can happen. But it really only means you want only one or zero *non-inlined* copies of the function. For example if you have an asm in it that must appear in the compiled assembly only once but you are confident it will be optimized out in inlined instances. Or if you play games with __builtin_return_address() and somehow have a very clear idea what the return values should be. I'm afraid I cannot give an opinion what you should do in this case without understanding the problem better. If you can isolate the case where noclone behaves weirdly into a self-contained testcase, I'd be happy to have a look at it. Martin ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 12:15 ` Martin Jambor @ 2018-12-18 12:31 ` Steven Rostedt 2018-12-18 14:01 ` Josh Poimboeuf 2018-12-18 21:15 ` Andi Kleen 2 siblings, 0 replies; 32+ messages in thread From: Steven Rostedt @ 2018-12-18 12:31 UTC (permalink / raw) To: Martin Jambor Cc: Miroslav Benes, Josh Poimboeuf, Andi Kleen, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers On Tue, 18 Dec 2018 13:15:40 +0100 Martin Jambor <mjambor@suse.cz> wrote: > I am the developer who introduced attribute noclone to GCC and also the > one who advises against using it :-) ...at least without also using the > noinline attribute, the combination means "I want only one or zero > copies of this function in the compiled assembly" which you might need > if you do fancy stuff in inline assembly, for example. Sounds to me that we should not be using noclone in this instance. > > I believe that when people use noclone on its own, in 99 out 100 cases > they actually want something else. Usually there is something that > references the function from code (such as assembly) or a tool that the > compiler does know about and then they should use the "used" attribute. > That is what I understood to be the use case here and therefore I > recommended it. In other cases people want all inter-procedural (IPA) > analyses to stay away from a function and then they should use attribute > noipa (or in older GCCs, the combination of noinline and noclone). > > The attribute was introduced because it is useful in GCC testsuite, and > although I think occasions when it is useful on its own elsewhere are > very rare and quite obscure, they can happen. But it really only means > you want only one or zero *non-inlined* copies of the function. For > example if you have an asm in it that must appear in the compiled > assembly only once but you are confident it will be optimized out in > inlined instances. Or if you play games with __builtin_return_address() > and somehow have a very clear idea what the return values should be. With this explanation, I really believe the answer is s/noclone/used/. Here's the problem that the "noinline noclone" is trying to solve. When someone compiles with LTO (which links much more and can remove unused functions, or inline them better) the ftrace dynamic self test functions can be inlined. If that happens, they lose the "mcount/fentry" from gcc -pg. Which was the reason they were made, and placed in a separate file to begin with. Then when the self tests try to enable function tracing on these functions, they can't and the boot up tests fail, disabling ftrace. The "noinline noclone" was added to prevent this case. And from your description, it sounds like "noinline used" is what we actually want. -- Steve ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 12:15 ` Martin Jambor 2018-12-18 12:31 ` Steven Rostedt @ 2018-12-18 14:01 ` Josh Poimboeuf 2018-12-18 21:20 ` Andi Kleen 2018-12-19 17:31 ` Martin Jambor 2018-12-18 21:15 ` Andi Kleen 2 siblings, 2 replies; 32+ messages in thread From: Josh Poimboeuf @ 2018-12-18 14:01 UTC (permalink / raw) To: Martin Jambor Cc: Miroslav Benes, Andi Kleen, Steven Rostedt, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers On Tue, Dec 18, 2018 at 01:15:40PM +0100, Martin Jambor wrote: > OK, I have read through it and with the caveats that I don't quite > understand what the failure is, that also believe attribute noclone > should not affect frame pointer generation, and that I don't quite get > how LTO comes into play, my comments are the following: > > I am the developer who introduced attribute noclone to GCC and also the > one who advises against using it :-) ...at least without also using the > noinline attribute, the combination means "I want only one or zero > copies of this function in the compiled assembly" which you might need > if you do fancy stuff in inline assembly, for example. > > I believe that when people use noclone on its own, in 99 out 100 cases > they actually want something else. Usually there is something that > references the function from code (such as assembly) or a tool that the > compiler does know about and then they should use the "used" attribute. > That is what I understood to be the use case here and therefore I > recommended it. In other cases people want all inter-procedural (IPA) > analyses to stay away from a function and then they should use attribute > noipa (or in older GCCs, the combination of noinline and noclone). > > The attribute was introduced because it is useful in GCC testsuite, and > although I think occasions when it is useful on its own elsewhere are > very rare and quite obscure, they can happen. But it really only means > you want only one or zero *non-inlined* copies of the function. For > example if you have an asm in it that must appear in the compiled > assembly only once but you are confident it will be optimized out in > inlined instances. Or if you play games with __builtin_return_address() > and somehow have a very clear idea what the return values should be. > > I'm afraid I cannot give an opinion what you should do in this case > without understanding the problem better. If you can isolate the case > where noclone behaves weirdly into a self-contained testcase, I'd be > happy to have a look at it. I whittled it down to a small test case. It turns out the problem is caused by the "__optimize__("no-tracer")" atribute, which is used by our __noclone macro: # if __has_attribute(__optimize__) # define __noclone __attribute__((__noclone__, __optimize__("no-tracer"))) # else # define __noclone __attribute__((__noclone__)) # endif Here's the test case. Notice it skips the frame pointer setup before the call to __sanitizer_cov_trace_pc(): $ cat test.c __attribute__((__optimize__("no-tracer"))) int test(void) { return 0; } $ gcc -O2 -fno-omit-frame-pointer -fsanitize-coverage=trace-pc -c -o test.o test.c $ objdump -dr test.o test.o: file format elf64-x86-64 Disassembly of section .text: 0000000000000000 <test>: 0: 48 83 ec 08 sub $0x8,%rsp 4: e8 00 00 00 00 callq 9 <test+0x9> 5: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4 9: 31 c0 xor %eax,%eax b: 48 83 c4 08 add $0x8,%rsp f: c3 retq It works if you remove the function attribute: 0000000000000000 <test>: 0: 55 push %rbp 1: 48 89 e5 mov %rsp,%rbp 4: e8 00 00 00 00 callq 9 <test+0x9> 5: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4 9: 31 c0 xor %eax,%eax b: 5d pop %rbp c: c3 retq Apparently we are using "no-tracer" because of: commit 95272c29378ee7dc15f43fa2758cb28a5913a06d Author: Paolo Bonzini <pbonzini@redhat.com> Date: Thu Mar 31 09:38:51 2016 +0200 compiler-gcc: disable -ftracer for __noclone functions -ftracer can duplicate asm blocks causing compilation to fail in noclone functions. For example, KVM declares a global variable in an asm like asm("2: ... \n .pushsection data \n .global vmx_return \n vmx_return: .long 2b"); and -ftracer causes a double declaration. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Marek <mmarek@suse.cz> Cc: stable@vger.kernel.org Cc: kvm@vger.kernel.org Reported-by: Linda Walsh <lkml@tlinx.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 14:01 ` Josh Poimboeuf @ 2018-12-18 21:20 ` Andi Kleen 2018-12-19 3:44 ` Sean Christopherson 2018-12-19 17:31 ` Martin Jambor 1 sibling, 1 reply; 32+ messages in thread From: Andi Kleen @ 2018-12-18 21:20 UTC (permalink / raw) To: Josh Poimboeuf Cc: Martin Jambor, Miroslav Benes, Steven Rostedt, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers > I whittled it down to a small test case. It turns out the problem is > caused by the "__optimize__("no-tracer")" atribute, which is used by our > __noclone macro: > > > # if __has_attribute(__optimize__) > # define __noclone __attribute__((__noclone__, __optimize__("no-tracer"))) > # else > # define __noclone __attribute__((__noclone__)) > # endif Ah interesting. That makes sense. Thanks for tracking that down. > > commit 95272c29378ee7dc15f43fa2758cb28a5913a06d > Author: Paolo Bonzini <pbonzini@redhat.com> > Date: Thu Mar 31 09:38:51 2016 +0200 > > compiler-gcc: disable -ftracer for __noclone functions > > -ftracer can duplicate asm blocks causing compilation to fail in > noclone functions. For example, KVM declares a global variable > in an asm like > > asm("2: ... \n > .pushsection data \n > .global vmx_return \n > vmx_return: .long 2b"); > > and -ftracer causes a double declaration. Ok. So the real solution would be to add a no tracer to the KVM function only. But of course it will drop its frame pointer. So if you rely on frame pointer for live patching you would still have a problem. Just fixing the ftrace case independently may not be useful? ftrace here really not interesting at all for live patching because the ftracing self test only happens at boot, when there is no live patching. Likely it's even in __init code. KVM on the other hand is quite important and actually happens at boot. So I would concentrate on fixing that one instead, but leave ftrace alone. My suggestion to fix KVM would be to drop the inline assembler and use out of line assembler in a separate file. Then disabling the tracer wouldn't be needed. VM entries are very expensive anyways, I doubt avoiding a single call/ret makes any difference at all. -Andi ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 21:20 ` Andi Kleen @ 2018-12-19 3:44 ` Sean Christopherson 0 siblings, 0 replies; 32+ messages in thread From: Sean Christopherson @ 2018-12-19 3:44 UTC (permalink / raw) To: Andi Kleen Cc: Josh Poimboeuf, Martin Jambor, Miroslav Benes, Steven Rostedt, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers, Paolo Bonzini, Nadav Amit +cc Nadav and Paolo On Tue, Dec 18, 2018 at 01:20:42PM -0800, Andi Kleen wrote: > > I whittled it down to a small test case. It turns out the problem is > > caused by the "__optimize__("no-tracer")" atribute, which is used by our > > __noclone macro: > > > > > > # if __has_attribute(__optimize__) > > # define __noclone __attribute__((__noclone__, __optimize__("no-tracer"))) > > # else > > # define __noclone __attribute__((__noclone__)) > > # endif > > Ah interesting. That makes sense. Thanks for tracking that down. > > > > > commit 95272c29378ee7dc15f43fa2758cb28a5913a06d > > Author: Paolo Bonzini <pbonzini@redhat.com> > > Date: Thu Mar 31 09:38:51 2016 +0200 > > > > compiler-gcc: disable -ftracer for __noclone functions > > > > -ftracer can duplicate asm blocks causing compilation to fail in > > noclone functions. For example, KVM declares a global variable > > in an asm like > > > > asm("2: ... \n > > .pushsection data \n > > .global vmx_return \n > > vmx_return: .long 2b"); > > > > and -ftracer causes a double declaration. > > Ok. So the real solution would be to add a no tracer to the KVM > function only. > > But of course it will drop its frame pointer. So if you rely on > frame pointer for live patching you would still have a problem. > > Just fixing the ftrace case independently may not be useful? > > ftrace here really not interesting at all for live patching because > the ftracing self test only happens at boot, when there is no live patching. > Likely it's even in __init code. > > KVM on the other hand is quite important and actually happens at boot. > > So I would concentrate on fixing that one instead, but leave ftrace > alone. > > My suggestion to fix KVM would be to drop the inline assembler > and use out of line assembler in a separate file. Then disabling > the tracer wouldn't be needed. > > VM entries are very expensive anyways, I doubt avoiding a single call/ret > makes any difference at all. The KVM code in question isn't inline asm for performance, but rather so that it can reference struct offsets when marshalling data between the CPU and software structs when loading/saving guest state. The structs in question are local to VMX, i.e. don't belong in asm-offsets.h, and until now there hasn't been a need to move away from inline asm. Nadav recently pointed out that the __noclone attribute results in vmx_vcpu_run() failing to inline even trivial code. So now there's two reasons to get rid of __noclone on vmx_vcpu_run(). But rather than move the entire asm blob wholesale into a separate asm file, we can move just VMLAUNCH+VMRESUME along with a minimal amount of wrapper code. I have working code, just need to write a changelog. I'll send a series tomorrow for the VMX change and a revert of commit 95272c29378e ("compiler-gcc: disable -ftracer for __noclone functions"). [1] https://patchwork.kernel.org/patch/8707981/#21817015 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 14:01 ` Josh Poimboeuf 2018-12-18 21:20 ` Andi Kleen @ 2018-12-19 17:31 ` Martin Jambor 1 sibling, 0 replies; 32+ messages in thread From: Martin Jambor @ 2018-12-19 17:31 UTC (permalink / raw) To: Josh Poimboeuf Cc: Miroslav Benes, Andi Kleen, Steven Rostedt, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers Hi, On Tue, Dec 18 2018, Josh Poimboeuf wrote: > On Tue, Dec 18, 2018 at 01:15:40PM +0100, Martin Jambor wrote: >> I'm afraid I cannot give an opinion what you should do in this case >> without understanding the problem better. If you can isolate the case >> where noclone behaves weirdly into a self-contained testcase, I'd be >> happy to have a look at it. > > I whittled it down to a small test case. It turns out the problem is > caused by the "__optimize__("no-tracer")" atribute, which is used by our > __noclone macro: Wonderful, thanks a lot. > > # if __has_attribute(__optimize__) > # define __noclone __attribute__((__noclone__, __optimize__("no-tracer"))) > # else > # define __noclone __attribute__((__noclone__)) > # endif > > > Here's the test case. Notice it skips the frame pointer setup before > the call to __sanitizer_cov_trace_pc(): > > > $ cat test.c > __attribute__((__optimize__("no-tracer"))) int test(void) > { > return 0; > } > $ gcc -O2 -fno-omit-frame-pointer -fsanitize-coverage=trace-pc -c -o test.o test.c Well, it might not be intuitive but the __optimize__ attribute resets to -O2 optimization defaults and thus drops -fno-omit-frame-pointer on the floor, so no wonder there is no frame pointer. This applies to other optimization options you pass on the command line, for example if you still compile kernel with -fno-strict-aliasing, you get GCC assuming strict aliasing in functions marked with your __noclone. > Apparently we are using "no-tracer" because of: > > > commit 95272c29378ee7dc15f43fa2758cb28a5913a06d > Author: Paolo Bonzini <pbonzini@redhat.com> > Date: Thu Mar 31 09:38:51 2016 +0200 > > compiler-gcc: disable -ftracer for __noclone functions > > -ftracer can duplicate asm blocks causing compilation to fail in > noclone functions. For example, KVM declares a global variable > in an asm like > > asm("2: ... \n > .pushsection data \n > .global vmx_return \n > vmx_return: .long 2b"); > > and -ftracer causes a double declaration. There is no way for a user to reliably prevent a duplication of an inline assembly statement. The correct fix is to move such statements to an assembly files. You have disabled tracer which duplicated it in your case but other passes might do that too in the future. The correct fix is to put such assembler snippets into separate assembly files or, if you for for some reason insist on inline assembly, to use '%=' inline assembler template format string (it is expanded to a number that is unique in a compilation unit). Martin ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 12:15 ` Martin Jambor 2018-12-18 12:31 ` Steven Rostedt 2018-12-18 14:01 ` Josh Poimboeuf @ 2018-12-18 21:15 ` Andi Kleen 2018-12-18 21:57 ` Steven Rostedt 2018-12-19 17:38 ` Martin Jambor 2 siblings, 2 replies; 32+ messages in thread From: Andi Kleen @ 2018-12-18 21:15 UTC (permalink / raw) To: Martin Jambor Cc: Miroslav Benes, Josh Poimboeuf, Steven Rostedt, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers > OK, I have read through it and with the caveats that I don't quite > understand what the failure is, that also believe attribute noclone > should not affect frame pointer generation, and that I don't quite get > how LTO comes into play, my comments are the following: > > I am the developer who introduced attribute noclone to GCC and also the > one who advises against using it :-) ...at least without also using the > noinline attribute, the combination means " The function in question uses noinline too. > I want only one or zero > copies of this function in the compiled assembly" which you might need > if you do fancy stuff in inline assembly, for example. For this case we only want one non inlined copy because it is used as a test case for a function tracer. LTO comes into play because it originally relied on being in a separate file, so it would not be inlined, but with LTO that doesn't work. > > I believe that when people use noclone on its own, in 99 out 100 cases > they actually want something else. Usually there is something that AFAIK there is no noclone without noinline in the kernel tree. > references the function from code (such as assembly) or a tool that the > compiler does know about and then they should use the "used" attribute. Neither in the ftrace case, nor in the KVM case (another user which has fancy inline assembly that cannot be duplicated) that's the case. It's just about having exactly one out of line instance. So based on that I think noclone is fine. Of course there is still the open question why exactly the frame pointer disappears. -Andi ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 21:15 ` Andi Kleen @ 2018-12-18 21:57 ` Steven Rostedt 2018-12-18 22:13 ` Andi Kleen 2018-12-19 17:38 ` Martin Jambor 1 sibling, 1 reply; 32+ messages in thread From: Steven Rostedt @ 2018-12-18 21:57 UTC (permalink / raw) To: Andi Kleen Cc: Martin Jambor, Miroslav Benes, Josh Poimboeuf, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers On Tue, 18 Dec 2018 13:15:01 -0800 Andi Kleen <ak@linux.intel.com> wrote: > > > > > I am the developer who introduced attribute noclone to GCC and also the > > one who advises against using it :-) ...at least without also using the > > noinline attribute, the combination means " > > The function in question uses noinline too. And that's all I care about. > > > I want only one or zero > > copies of this function in the compiled assembly" which you might need > > if you do fancy stuff in inline assembly, for example. > > For this case we only want one non inlined copy because it is used as a > test case for a function tracer. The function tracer selftest should work just fine if there's more than one copy. Because the test will enable all copies. > > LTO comes into play because it originally relied on being in a separate > file, so it would not be inlined, but with LTO that doesn't work. The reason for it being in a separate file is because the entire directory is marked "notrace". That file, and that file alone in that directory, gets compiled with the -pg flags. As long as the functions in that file don't get inlined (which will make them notrace as well) all is good. Thus, if used and/or noinline prevents the functions from being inlined, and thus have their profiling removed, we should be good. Hmm, how does that work? When does LTO do its linker magic? Because the fentry/mcounts are added when the object is created. Are they removed if the compiler sees that it can be inlined? Or does LTO just compile everything in one go? > > > > > I believe that when people use noclone on its own, in 99 out 100 cases > > they actually want something else. Usually there is something that > > AFAIK there is no noclone without noinline in the kernel tree. > > > > references the function from code (such as assembly) or a tool that the > > compiler does know about and then they should use the "used" attribute. > > Neither in the ftrace case, nor in the KVM case (another user which > has fancy inline assembly that cannot be duplicated) that's the case. > It's just about having exactly one out of line instance. Again, that's not the ftrace case. It doesn't care about more than one out of line instance. Thus, for this particular use, "used" should be good enough. -- Steve ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 21:57 ` Steven Rostedt @ 2018-12-18 22:13 ` Andi Kleen 2018-12-18 22:16 ` Steven Rostedt 0 siblings, 1 reply; 32+ messages in thread From: Andi Kleen @ 2018-12-18 22:13 UTC (permalink / raw) To: Steven Rostedt Cc: Martin Jambor, Miroslav Benes, Josh Poimboeuf, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers On Tue, Dec 18, 2018 at 04:57:13PM -0500, Steven Rostedt wrote: > Hmm, how does that work? When does LTO do its linker magic? Because the > fentry/mcounts are added when the object is created. Are they removed > if the compiler sees that it can be inlined? Or does LTO just compile > everything in one go? LTO compiles everything in one go at link time. The objects just contain immediate code. Also in principle it should track command line options from the command line and apply them by function, but there were bugs regarding this in older gcc versions so it may not always work. > Again, that's not the ftrace case. It doesn't care about more than one > out of line instance. Thus, for this particular use, "used" should be > good enough. You mean noinline used? Inlining would certainly break the test. -Andi ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 22:13 ` Andi Kleen @ 2018-12-18 22:16 ` Steven Rostedt 2018-12-18 23:26 ` Andi Kleen 0 siblings, 1 reply; 32+ messages in thread From: Steven Rostedt @ 2018-12-18 22:16 UTC (permalink / raw) To: Andi Kleen Cc: Martin Jambor, Miroslav Benes, Josh Poimboeuf, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers On Tue, 18 Dec 2018 14:13:38 -0800 Andi Kleen <ak@linux.intel.com> wrote: > > Again, that's not the ftrace case. It doesn't care about more than one > > out of line instance. Thus, for this particular use, "used" should be > > good enough. > > You mean noinline used? I thought that someone said that "used" would also prevent inlining. But regardless, whatever it takes to not let the functions be inlined is indeed good enough. -- Steve > > Inlining would certainly break the test. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 22:16 ` Steven Rostedt @ 2018-12-18 23:26 ` Andi Kleen 2018-12-18 23:40 ` Steven Rostedt 0 siblings, 1 reply; 32+ messages in thread From: Andi Kleen @ 2018-12-18 23:26 UTC (permalink / raw) To: Steven Rostedt Cc: Martin Jambor, Miroslav Benes, Josh Poimboeuf, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers On Tue, Dec 18, 2018 at 05:16:20PM -0500, Steven Rostedt wrote: > On Tue, 18 Dec 2018 14:13:38 -0800 > Andi Kleen <ak@linux.intel.com> wrote: > > > > Again, that's not the ftrace case. It doesn't care about more than one > > > out of line instance. Thus, for this particular use, "used" should be > > > good enough. > > > > You mean noinline used? > > I thought that someone said that "used" would also prevent inlining. that's not correct. You need noinline -Andi [ak@tassilo tsrc]$ cat tinline.c int i; inline __attribute__((used)) int finline(void) { i++; } main() { finline(); } [ak@tassilo tsrc]$ gcc -O2 -S tinline.c tinline.c:10:1: warning: return type defaults to ‘int’ [-Wimplicit-int] main() ^~~~ [ak@tassilo tsrc]$ [ak@tassilo tsrc]$ cat tinline.s .file "tinline.c" .text .section .text.startup,"ax",@progbits .p2align 4,,15 .globl main .type main, @function main: .LFB1: .cfi_startproc addl $1, i(%rip) xorl %eax, %eax ret .cfi_endproc .LFE1: .size main, .-main .comm i,4,4 .ident "GCC: (GNU) 8.2.1 20181105 (Red Hat 8.2.1-5)" .section .note.GNU-stack,"",@progbits [ak@tassilo tsrc]$ ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 23:26 ` Andi Kleen @ 2018-12-18 23:40 ` Steven Rostedt 0 siblings, 0 replies; 32+ messages in thread From: Steven Rostedt @ 2018-12-18 23:40 UTC (permalink / raw) To: Andi Kleen Cc: Martin Jambor, Miroslav Benes, Josh Poimboeuf, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers On Tue, 18 Dec 2018 15:26:36 -0800 Andi Kleen <ak@linux.intel.com> wrote: > > I thought that someone said that "used" would also prevent inlining. > > that's not correct. You need noinline Sure, whatever. That's besides the point. The thing is, I don't need noclone. Just remove that and keep the noiniline, and if that keeps the functions in that file still able to be function traced, then I'm happy. I was only explaining why I mentioned "used" in the previous email. If you read the part that you cut off in your reply, I said "But regardless, whatever it takes to not let the functions be inlined is indeed good enough." I don't care if that's "used" on "noinline", you didn't need to waste your time to post the assembly. -- Steve ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-18 21:15 ` Andi Kleen 2018-12-18 21:57 ` Steven Rostedt @ 2018-12-19 17:38 ` Martin Jambor 1 sibling, 0 replies; 32+ messages in thread From: Martin Jambor @ 2018-12-19 17:38 UTC (permalink / raw) To: Andi Kleen Cc: Miroslav Benes, Josh Poimboeuf, Steven Rostedt, Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers Hi, On Tue, Dec 18 2018, Andi Kleen wrote: >> OK, I have read through it and with the caveats that I don't quite >> understand what the failure is, that also believe attribute noclone >> should not affect frame pointer generation, and that I don't quite get >> how LTO comes into play, my comments are the following: > >> >> I am the developer who introduced attribute noclone to GCC and also the >> one who advises against using it :-) ...at least without also using the >> noinline attribute, the combination means " > > The function in question uses noinline too. > >> I want only one or zero >> copies of this function in the compiled assembly" which you might need >> if you do fancy stuff in inline assembly, for example. > > For this case we only want one non inlined copy because it is used as a > test case for a function tracer. > > LTO comes into play because it originally relied on being in a separate > file, so it would not be inlined, but with LTO that doesn't work. > >> >> I believe that when people use noclone on its own, in 99 out 100 cases >> they actually want something else. Usually there is something that > > AFAIK there is no noclone without noinline in the kernel tree. > > >> references the function from code (such as assembly) or a tool that the >> compiler does know about and then they should use the "used" attribute. > > Neither in the ftrace case, nor in the KVM case (another user which > has fancy inline assembly that cannot be duplicated) that's the case. > It's just about having exactly one out of line instance. > > So based on that I think noclone is fine. Of course there > is still the open question why exactly the frame pointer disappears. I agree, I originally thought the problem was something else. Thanks for the clarification, Martin ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-17 22:36 ` Steven Rostedt 2018-12-18 0:06 ` Andi Kleen @ 2018-12-18 3:05 ` Josh Poimboeuf 1 sibling, 0 replies; 32+ messages in thread From: Josh Poimboeuf @ 2018-12-18 3:05 UTC (permalink / raw) To: Steven Rostedt Cc: Peter Zijlstra, Andi Kleen, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers, Miroslav Benes On Mon, Dec 17, 2018 at 05:36:44PM -0500, Steven Rostedt wrote: > On Mon, 17 Dec 2018 15:31:26 -0600 > Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > > On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote: > > > On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote: > > > > > > > > Yes LTO causes the to be treated like static functions. > > > > > > > > > > I guess noclone is unlikely to be really needed here because these > > > > > functions are unlikely to be cloned. > > > > > > > > > > So as a workaround it could be removed. > > > > > > > > > > But note we have other noclone functions in the tree (like in KVM) > > > > > which actually need it. > > > > > > > > How about we just use the __used attribute then? It seems to have the > > > > same result of preventing IPA optimizations (without the weird side > > > > effect of missing frame pointers). > > > > > > AFAIK we don't have any in-tree LTO, so it can all go in the bin. > > > > > > When/if we get the LTO trainwreck sorted -- which very much includes > > > getting that memory-order-consume fixed -- we can revisit all that. > > > > Ok, then if there are no objections I'll just send a revert of: > > > > dd3dad0d716d ("ftrace: Mark function tracer test functions noinline/noclone") > > > > Should it be reverted, or just remove the noclone, and keep the > noinline? If we want to support out-of-tree LTO, then it should only need "used", because wouldn't that imply noinline? If we *don't* want to support out-of-tree LTO, then it shouldn't need any special attributes, because the functions aren't static so GCC won't mess with their ABI. -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-17 21:31 ` Josh Poimboeuf 2018-12-17 22:36 ` Steven Rostedt @ 2018-12-17 23:54 ` Andi Kleen 1 sibling, 0 replies; 32+ messages in thread From: Andi Kleen @ 2018-12-17 23:54 UTC (permalink / raw) To: Josh Poimboeuf Cc: Peter Zijlstra, Arnd Bergmann, Linux Kernel Mailing List, the arch/x86 maintainers, Steven Rostedt, Miroslav Benes On Mon, Dec 17, 2018 at 03:31:26PM -0600, Josh Poimboeuf wrote: > On Mon, Dec 17, 2018 at 08:29:38PM +0100, Peter Zijlstra wrote: > > On Mon, Dec 17, 2018 at 12:16:38PM -0600, Josh Poimboeuf wrote: > > > > > > Yes LTO causes the to be treated like static functions. > > > > > > > > I guess noclone is unlikely to be really needed here because these > > > > functions are unlikely to be cloned. > > > > > > > > So as a workaround it could be removed. > > > > > > > > But note we have other noclone functions in the tree (like in KVM) > > > > which actually need it. > > > > > > How about we just use the __used attribute then? It seems to have the > > > same result of preventing IPA optimizations (without the weird side > > > effect of missing frame pointers). > > > > AFAIK we don't have any in-tree LTO, so it can all go in the bin. > > > > When/if we get the LTO trainwreck sorted -- which very much includes > > getting that memory-order-consume fixed -- we can revisit all that. > > Ok, then if there are no objections I'll just send a revert of: > > dd3dad0d716d ("ftrace: Mark function tracer test functions noinline/noclone") I object to that. -Andi ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: objtool warnings for kernel/trace/trace_selftest_dynamic.o 2018-12-17 18:16 ` Josh Poimboeuf 2018-12-17 19:29 ` Peter Zijlstra @ 2018-12-17 21:03 ` Andi Kleen 1 sibling, 0 replies; 32+ messages in thread From: Andi Kleen @ 2018-12-17 21:03 UTC (permalink / raw) To: Josh Poimboeuf Cc: Arnd Bergmann, Peter Zijlstra, Linux Kernel Mailing List, the arch/x86 maintainers, Steven Rostedt, Miroslav Benes > > That seems weird. > > > > Are you sure it's not just because they are empty? AFAIK > > gcc doesn't necessarily generate frame pointers for empty functions. > > I suspected that it was because they're empty, however I didn't see this > warning for other leaf functions. The sancov plugin is presumably > taking care of adding frame pointers where needed. Also, adding That would surprise me. > -mno-omit-leaf-frame-pointer didn't fix it. > > And anyway I confirmed that it was fixed by removing __noclone. So this is with a plugin? Maybe the plugin does something wrong? I thought this was just a standard build. I'm not sure the problem is well enough understood yet to really do anything. Do you have a simple standalone test case to show the compiler behavior? -Andi ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2018-12-19 17:38 UTC | newest] Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-12-16 18:33 objtool warnings for kernel/trace/trace_selftest_dynamic.o Arnd Bergmann 2018-12-17 17:39 ` Josh Poimboeuf 2018-12-17 18:04 ` Andi Kleen 2018-12-17 18:16 ` Josh Poimboeuf 2018-12-17 19:29 ` Peter Zijlstra 2018-12-17 20:55 ` Andi Kleen 2018-12-17 22:35 ` Peter Zijlstra 2018-12-17 23:59 ` Andi Kleen 2018-12-18 9:19 ` Peter Zijlstra 2018-12-18 21:22 ` Andi Kleen 2018-12-17 21:31 ` Josh Poimboeuf 2018-12-17 22:36 ` Steven Rostedt 2018-12-18 0:06 ` Andi Kleen 2018-12-18 2:49 ` Josh Poimboeuf 2018-12-18 4:22 ` Andi Kleen 2018-12-18 9:28 ` Miroslav Benes 2018-12-18 12:15 ` Martin Jambor 2018-12-18 12:31 ` Steven Rostedt 2018-12-18 14:01 ` Josh Poimboeuf 2018-12-18 21:20 ` Andi Kleen 2018-12-19 3:44 ` Sean Christopherson 2018-12-19 17:31 ` Martin Jambor 2018-12-18 21:15 ` Andi Kleen 2018-12-18 21:57 ` Steven Rostedt 2018-12-18 22:13 ` Andi Kleen 2018-12-18 22:16 ` Steven Rostedt 2018-12-18 23:26 ` Andi Kleen 2018-12-18 23:40 ` Steven Rostedt 2018-12-19 17:38 ` Martin Jambor 2018-12-18 3:05 ` Josh Poimboeuf 2018-12-17 23:54 ` Andi Kleen 2018-12-17 21:03 ` Andi Kleen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).