* Additional debug info to aid cacheline analysis
@ 2020-10-06 13:17 Peter Zijlstra
  2020-10-06 19:00 ` Arnaldo Carvalho de Melo
  2020-10-08 5:58 ` Stephane Eranian
  0 siblings, 2 replies; 27+ messages in thread
From: Peter Zijlstra @ 2020-10-06 13:17 UTC (permalink / raw)
  To: linux-toolchains, Stephane Eranian, Arnaldo Carvalho de Melo
  Cc: linux-kernel, Ingo Molnar, Jiri Olsa, namhyung, irogers,
      kim.phillips, Mark Rutland

Hi all,

I've been trying to float this idea for a fair number of years, and I
think at least Stephane has been talking to tools people about it, but
I'm not sure what, if anything, ever happened with it, so let me post it
here :-)

Basically, what I want is a (perf) tool for cacheline optimizations.
Something very much like the excellent pahole tool, but with hit/miss
information added.

Now, some PMUs provide the data address for various relevant events, but
that gets us the problem of mapping a 'random' address to a type and
offset. And esp. for dynamic objects, that's a difficult problem.

However, the compiler actually knows what type and offset (most) memory
references are, so if perf can get us the exact IP (Intel PEBS / AMD
IBS, as opposed to one with skid on) we could get the type from debug
info.

And therein lies the rub, existing debug info (DWARF) does contain type
information, but in a way that is (I've been told) _very_ hard to use
for this purpose.

So could the compiler emit extra debug info for every instruction with a
memory reference on to facilitate this?

~ Peter

^ permalink raw reply [flat|nested] 27+ messages in thread
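To make the "pahole with hit/miss information" idea concrete, here is a minimal sketch of the aggregation step only. It assumes the hard part is already solved, i.e. each sample has been resolved to an offset within a known struct. The struct layout, the sample list and the 64-byte line size are all made up for illustration; nothing here is an existing perf or pahole interface.

```python
from collections import defaultdict

CACHELINE = 64  # bytes; assumed line size

# Hypothetical struct layout: (field name, byte offset, size in bytes).
LAYOUT = [
    ("lock", 0, 4),
    ("refcount", 4, 4),
    ("flags", 8, 8),
    ("stats", 64, 16),
    ("name", 80, 32),
]


def field_at(layout, offset):
    """Return the name of the field covering a byte offset, or None for padding."""
    for name, start, size in layout:
        if start <= offset < start + size:
            return name
    return None


def cacheline_report(layout, samples, line_size=CACHELINE):
    """Aggregate (offset, was_hit) samples into per-cacheline, per-field counts."""
    # cacheline index -> field name -> [hits, misses]
    report = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for offset, was_hit in samples:
        field = field_at(layout, offset)
        if field is None:
            continue  # the sample landed in padding; ignore it
        counts = report[offset // line_size][field]
        counts[0 if was_hit else 1] += 1
    return {
        line: {name: tuple(c) for name, c in fields.items()}
        for line, fields in report.items()
    }


# Made-up samples: (offset within the struct, True for hit / False for miss).
SAMPLES = [(0, True), (4, False), (4, False), (8, True), (64, False), (90, True)]

for line, fields in sorted(cacheline_report(LAYOUT, SAMPLES).items()):
    print(f"cacheline {line}:")
    for name, (hits, misses) in fields.items():
        print(f"  {name:10s} hits={hits} misses={misses}")
```

In a report like this, a field that collects most of the misses while sharing a line with frequently written neighbours would be a candidate for moving to a different cacheline.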
* Re: Additional debug info to aid cacheline analysis 2020-10-06 13:17 Additional debug info to aid cacheline analysis Peter Zijlstra @ 2020-10-06 19:00 ` Arnaldo Carvalho de Melo 2020-10-08 5:58 ` Stephane Eranian 1 sibling, 0 replies; 27+ messages in thread From: Arnaldo Carvalho de Melo @ 2020-10-06 19:00 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-toolchains, Stephane Eranian, linux-kernel, Ingo Molnar, Jiri Olsa, namhyung, irogers, kim.phillips, Mark Rutland, andrii Em Tue, Oct 06, 2020 at 03:17:03PM +0200, Peter Zijlstra escreveu: > Hi all, > I've been trying to float this idea for a fair number of years, and I > think at least Stephane has been talking to tools people about it, but > I'm not sure what, if anything, ever happened with it, so let me post it > here :-) > Basically, what I want is a (perf) tool for cacheline optimizations. > Something very much like the excellent pahole tool, but with hit/miss > information added. > Now, some PMUs provide the data address for various relevant events, but > that gets us the problem of mapping a 'random' address to a type and > offset. And esp. for dynamic objects, that's a difficult problem. > However, the compiler actually knows what type and offset (most) memory > references are, so if perf can get us the exact IP (Intel PEBS / AMD > IBS, as opposed to one with skid on) we could get the type from debug > info. > And therein lies the rub, existing debug info (DWARF) does contain type > information, but in a way that is (I've been told) _very_ hard to use > for this purpose. > So could the compiler emit extra debug info for every instruction with a > memory reference on to facilitate this? I guess this is what is done to enable CO-RE, there you have to mark areas of interest, i.e. 
in your program you enclose access to fields of kernel data structures you use in your BPF program so that when loading it libbpf can check at the fields used in your program and in the kernel (/sys/kernel/btf/vmlinux) and figure out if those fields moved, then it fixes up the offsets from the start of the struct. You want those relocation records for all types in the kernel, not to fixup things, but to figure out that some load or store in some struct member is for a type. https://facebookmicrosites.github.io/bpf/blog/2020/02/19/bpf-portability-and-co-re.html <quote> Compiler support To enable BPF CO-RE and let BPF loader (i.e., libbpf) to adjust BPF program to a particular kernel running on target host, Clang was extended with few built-ins. They emit BTF relocations which capture a high-level description of what pieces of information BPF program code intended to read. If you were going to access task_struct->pid field, Clang would record that it was exactly a field named "pid" of type “pid_t” residing within a struct task_struct. This is done so that even if target kernel has a task_struct layout in which “pid” field got moved to a different offset within a task_struct structure (e.g., due to extra field added before “pid” field), or even if it was moved into some nested anonymous struct or union (and this is completely transparent in C code, so no one ever pays attention to details like that), we’ll still be able to find it just by its name and type information. This is called a field offset relocation. It is possible to capture (and subsequently relocate) not just a field offset, but other field aspects, like field existence or size. Even for bitfields (which are notoriously "uncooperative" kinds of data in the C language, resisting efforts to make them relocatable) it is still possible to capture enough information to make them relocatable, all transparently to BPF program developer. 
</quote> <quote> High-level BPF CO-RE mechanics BPF CO-RE brings together necessary pieces of functionality and data at all levels of the software stack: kernel, user-space BPF loader library (libbpf), and compiler (Clang) – to make it possible and easy to write BPF programs in a portable manner, handling discrepancies between different kernels within the same pre-compiled BPF program. BPF CO-RE requires a careful integration and cooperation of the following components: BTF type information, which allows to capture crucial pieces of information about kernel and BPF program types and code, enabling all the other parts of BPF CO-RE puzzle; compiler (Clang) provides means for BPF program C code to express the intent and record relocation information; BPF loader (libbpf) ties BTFs from kernel and BPF program together to adjust compiled BPF code to specific kernel on target hosts; kernel, while staying completely BPF CO-RE-agnostic, provides advanced BPF features to enable some of the more advanced scenarios. Working in ensemble, these components enable unprecedented ability to develop portable BPF programs with ease, adaptability, and expressivity, previously achievable only through compiling BPF program’s C code in runtime through BCC, but without paying a high price of the BCC way. </quote> - Arnaldo ^ permalink raw reply [flat|nested] 27+ messages in thread
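The field offset relocation described in the quoted blog text can be sketched roughly as follows. This is a toy model, not libbpf's algorithm: the layouts are hypothetical dictionaries rather than BTF, and nested anonymous structs, bitfields and type compatibility rules are ignored. It only shows the core idea of recording "field pid of type pid_t in struct task_struct" and looking it up again by name and type.

```python
# Hypothetical layouts: struct name -> {field name: (type name, byte offset)}.
BUILD_KERNEL = {
    "task_struct": {"state": ("long", 0), "pid": ("pid_t", 8)},
}
TARGET_KERNEL = {
    # An extra field was added before "pid", so "pid" moved.
    "task_struct": {"state": ("long", 0), "extra": ("int", 8), "pid": ("pid_t", 16)},
}


def record_relocation(layouts, struct, field):
    """Capture what the program meant to access, not just the raw offset."""
    type_name, offset = layouts[struct][field]
    return {"struct": struct, "field": field, "type": type_name, "offset": offset}


def relocate(reloc, target_layouts):
    """Find the same field by name and type in the target layout."""
    target = target_layouts.get(reloc["struct"], {}).get(reloc["field"])
    if target is None or target[0] != reloc["type"]:
        raise LookupError(f"{reloc['struct']}.{reloc['field']} not found in target")
    return target[1]


reloc = record_relocation(BUILD_KERNEL, "task_struct", "pid")
print("offset at build time:", reloc["offset"])
print("offset on target:    ", relocate(reloc, TARGET_KERNEL))
```

For the cacheline use case the same kind of record would be read the other way around: not to patch an offset, but to learn that a given load or store touches a particular member of a particular type.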
* Re: Additional debug info to aid cacheline analysis
  2020-10-06 13:17 Additional debug info to aid cacheline analysis Peter Zijlstra
  2020-10-06 19:00 ` Arnaldo Carvalho de Melo
@ 2020-10-08 5:58 ` Stephane Eranian
  2020-10-08 7:02 ` Peter Zijlstra
  1 sibling, 1 reply; 27+ messages in thread
From: Stephane Eranian @ 2020-10-08 5:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-toolchains, Arnaldo Carvalho de Melo, linux-kernel,
      Ingo Molnar, Jiri Olsa, Namhyung Kim, Ian Rogers, Phillips, Kim,
      Mark Rutland

Hi Peter,

On Tue, Oct 6, 2020 at 6:17 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> Hi all,
>
> I've been trying to float this idea for a fair number of years, and I
> think at least Stephane has been talking to tools people about it, but
> I'm not sure what, if anything, ever happened with it, so let me post it
> here :-)
>
>
Thanks for bringing this back. This is a pet project of mine and I
have been looking at it intermittently for the last 4 years. I simply
never got a chance to complete it because it was preempted by other
higher-priority projects. I have developed an internal
proof-of-concept prototype using one of the 3 approaches I know. My
goal was to demonstrate that PMU statistical sampling of loads/stores
with data addresses would work as well as instrumentation. This is
slightly different from hit/miss in the analysis but the process is
the same.

As you point out, the difficulty is not so much in collecting the
sample but rather in symbolizing data addresses from the heap.

Intel PEBS and IBM Marked Events work well to collect the data. AMD IBS
works, though you get a lot of irrelevant samples due to the lack of
hardware filtering. ARM SPE would work too. Overall, all the major
architectures will provide the sampling support needed.

Some time ago, I had my intern pursue the other 2 approaches for
symbolization. The one I see as most promising is using the DWARF
information (no BPF needed). The good news is that I believe we do not
need more information than what is already there. We just need the
compiler to generate valid DWARF at most optimization levels, which I
believe is not the case for LLVM-based compilers but is maybe okay for
GCC.

Once we have the DWARF logic in place, it is easier to improve
perf report/annotate to do hit/miss or hot/cold, read/write analysis
on each data type and the fields within.

Once we have the code for perf, we are planning to contribute it upstream.

In the meantime, we need to lean on the compiler teams to ensure no
data type information is lost at high optimization levels. My
understanding from talking with some compiler folks is that this is
not a trivial fix.

> Basically, what I want is a (perf) tool for cacheline optimizations.
> Something very much like the excellent pahole tool, but with hit/miss
> information added.
>
> Now, some PMUs provide the data address for various relevant events, but
> that gets us the problem of mapping a 'random' address to a type and
> offset. And esp. for dynamic objects, that's a difficult problem.
>
> However, the compiler actually knows what type and offset (most) memory
> references are, so if perf can get us the exact IP (Intel PEBS / AMD
> IBS, as opposed to one with skid on) we could get the type from debug
> info.
>
> And therein lies the rub, existing debug info (DWARF) does contain type
> information, but in a way that is (I've been told) _very_ hard to use
> for this purpose.
>
> So could the compiler emit extra debug info for every instruction with a
> memory reference on to facilitate this?
>
> ~ Peter

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Additional debug info to aid cacheline analysis
  2020-10-08 5:58 ` Stephane Eranian
@ 2020-10-08 7:02 ` Peter Zijlstra
  2020-10-08 9:32 ` Mark Wielaard
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2020-10-08 7:02 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-toolchains, Arnaldo Carvalho de Melo, linux-kernel,
      Ingo Molnar, Jiri Olsa, Namhyung Kim, Ian Rogers, Phillips, Kim,
      Mark Rutland, Andi Kleen, Masami Hiramatsu

My apologies for adding a typo to the linux-kernel address, corrected
now.

On Wed, Oct 07, 2020 at 10:58:00PM -0700, Stephane Eranian wrote:
> Hi Peter,
>
> On Tue, Oct 6, 2020 at 6:17 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > Hi all,
> >
> > I've been trying to float this idea for a fair number of years, and I
> > think at least Stephane has been talking to tools people about it, but
> > I'm not sure what, if anything, ever happened with it, so let me post it
> > here :-)
> >
> >
> Thanks for bringing this back. This is a pet project of mine and I
> have been looking at it for the last 4 years intermittently now.
> Simply never got a chance to complete because preempted by other
> higher priority projects. I have developed an internal
> proof-of-concept prototype using one of the 3 approaches I know. My
> goal was to demonstrate that PMU statistical sampling of loads/stores
> and with data addresses would work as well as instrumentation. This is
> slightly different from hit/miss in the analysis but the process is
> the same.
>
> As you point out, the difficulty is not so much in collecting the
> sample but rather in symbolizing data addresses from the heap.

Right, that's non-trivial, although for static and per-cpu objects it
should be rather straightforward, heap objects are going to be a pain.

You'd basically have to also log the alloc/free of every object along
with the data type used for it, which is not something we have readily
available at the allocator.

> Intel PEBS, IBM Marked Events work well to collect the data. AMD IBS
> works though you get a lot of irrelevant samples due to lack of
> hardware filtering. ARM SPE would work too. Overall, all the major
> architectures will provide the sampling support needed.

That's for the data address, or also the eventing IP?

> Some time ago, I had my intern pursue the other 2 approaches for
> symbolization. The one I see as most promising is by using the DWARF
> information (no BPF needed). The good news is that I believe we do not
> need more information than what is already there. We just need the
> compiler to generate valid DWARF at most optimization levels, which I
> believe is not the case for LLVM based compilers but maybe okay for
> GCC.

Right, I think GCC improved a lot on this front over the past few years.
Also added Andi and Masami, who have worked on this or related topics.

> Once we have the DWARF logic in place then it is easier to improve
> perf report/annotate do to hit/miss or hot/cold, read/write analysis
> on each data type and fields within.
>
> Once we have the code for perf, we are planning to contribute it upstream.
>
> In the meantime, we need to lean on the compiler teams to ensure no
> data type information is lost with high optimizations levels. My
> understanding from talking with some compiler folks is that this is
> not a trivial fix.

As you might have noticed, I sent this to the linux-toolchains list.

While you lean on your compiler folks, try and get them subscribed to
this list. It is meant to discuss toolchain issues as related to Linux.
Both GCC/binutils and LLVM should be represented here.

^ permalink raw reply [flat|nested] 27+ messages in thread
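The "log the alloc/free of every object along with the data type" approach for heap objects could look roughly like the sketch below. It is only an illustration under strong assumptions: the `AllocLog` class and its methods are hypothetical names, live objects are assumed not to overlap, and the ordering problem (replaying alloc/free events and samples by timestamp) is left out entirely.

```python
import bisect


class AllocLog:
    """Tracks live heap objects so a sampled data address can be symbolized."""

    def __init__(self):
        self._starts = []  # sorted start addresses of live objects
        self._live = {}    # start address -> (size, type name)

    def alloc(self, addr, size, type_name):
        """Record that an object of the given type now lives at addr."""
        bisect.insort(self._starts, addr)
        self._live[addr] = (size, type_name)

    def free(self, addr):
        """Forget the object that started at addr, if it is known."""
        if addr in self._live:
            del self._live[addr]
            self._starts.pop(bisect.bisect_left(self._starts, addr))

    def symbolize(self, data_addr):
        """Map a data address to (type name, offset), or None if unknown."""
        i = bisect.bisect_right(self._starts, data_addr) - 1
        if i < 0:
            return None
        start = self._starts[i]
        size, type_name = self._live[start]
        if data_addr >= start + size:
            return None  # the address falls in a gap between objects
        return type_name, data_addr - start


log = AllocLog()
log.alloc(0x1000, 64, "struct foo")
log.alloc(0x2000, 128, "struct bar")
print(log.symbolize(0x1010))
log.free(0x1000)
print(log.symbolize(0x1010))
```

The sorted list plus binary search keeps each lookup logarithmic in the number of live objects, which matters if every sample has to be resolved this way.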
* Re: Additional debug info to aid cacheline analysis
  2020-10-08 7:02 ` Peter Zijlstra
@ 2020-10-08 9:32 ` Mark Wielaard
  2020-10-08 21:23 ` Andi Kleen
  2020-10-30 5:26 ` Namhyung Kim
  0 siblings, 2 replies; 27+ messages in thread
From: Mark Wielaard @ 2020-10-08 9:32 UTC (permalink / raw)
  To: Peter Zijlstra, Stephane Eranian
  Cc: linux-toolchains, Arnaldo Carvalho de Melo, linux-kernel,
      Ingo Molnar, Jiri Olsa, Namhyung Kim, Ian Rogers, Phillips, Kim,
      Mark Rutland, Andi Kleen, Masami Hiramatsu

Hi,

On Thu, 2020-10-08 at 09:02 +0200, Peter Zijlstra wrote:
> > Some time ago, I had my intern pursue the other 2 approaches for
> > symbolization. The one I see as most promising is by using the DWARF
> > information (no BPF needed). The good news is that I believe we do not
> > need more information than what is already there. We just need the
> > compiler to generate valid DWARF at most optimization levels, which I
> > believe is not the case for LLVM based compilers but maybe okay for
> > GCC.
>
> Right, I think GCC improved a lot on this front over the past few years.
> Also added Andi and Masami, who have worked on this or related topics.

For GCC Alexandre Oliva did a really thorough write up of all the
various optimizations and their effect on debugging/DWARF:
https://www.fsfla.org/~lxoliva/writeups/gOlogy/gOlogy.html

GCC using -fvar-tracking and -fvar-tracking-assignments is pretty good
at keeping track of where variables are held (in memory or registers)
at each point in the program, even through various optimizations.

-fvar-tracking-assignments is the default with -g -O2, except for the
upstream linux kernel code. Most distros enable it again, but you do
want to enable it by hand when building from the upstream linux git
repo.

Basically you simply want to remove this line in the top-level
Makefile:

DEBUG_CFLAGS := $(call cc-option, -fno-var-tracking-assignments)

Cheers,

Mark

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Additional debug info to aid cacheline analysis 2020-10-08 9:32 ` Mark Wielaard @ 2020-10-08 21:23 ` Andi Kleen 2020-10-10 20:58 ` Mark Wielaard 2020-10-30 5:26 ` Namhyung Kim 1 sibling, 1 reply; 27+ messages in thread From: Andi Kleen @ 2020-10-08 21:23 UTC (permalink / raw) To: Mark Wielaard Cc: Peter Zijlstra, Stephane Eranian, linux-toolchains, Arnaldo Carvalho de Melo, linux-kernel, Ingo Molnar, Jiri Olsa, Namhyung Kim, Ian Rogers, Phillips, Kim, Mark Rutland, Andi Kleen, Masami Hiramatsu > Basically you simply want to remove this line in the top-level > Makefile: > > DEBUG_CFLAGS := $(call cc-option, -fno-var-tracking-assignments) It looks like this was needed as a workaround for a gcc bug that was there from 4.5 to 4.9. So I guess could disable it for 5.0+ only. -Andi ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Additional debug info to aid cacheline analysis
  2020-10-08 21:23 ` Andi Kleen
@ 2020-10-10 20:58 ` Mark Wielaard
  2020-10-10 21:51 ` Mark Wielaard
  ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Mark Wielaard @ 2020-10-10 20:58 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Peter Zijlstra, Stephane Eranian, linux-toolchains,
      Arnaldo Carvalho de Melo, linux-kernel, Ingo Molnar, Jiri Olsa,
      Namhyung Kim, Ian Rogers, Phillips, Kim, Mark Rutland,
      Masami Hiramatsu

[-- Attachment #1: Type: text/plain, Size: 674 bytes --]

On Thu, Oct 08, 2020 at 02:23:00PM -0700, Andi Kleen wrote:
> > Basically you simply want to remove this line in the top-level
> > Makefile:
> >
> > DEBUG_CFLAGS := $(call cc-option, -fno-var-tracking-assignments)
>
> It looks like this was needed as a workaround for a gcc bug that was there
> from 4.5 to 4.9.
>
> So I guess could disable it for 5.0+ only.

Yes, that would work. I don't know what the lowest supported GCC
version is, but technically it was definitely fixed in 4.10.0, 4.8.4
and 4.9.2. And various distros would probably have backported the
fix. But checking for 5.0+ would certainly give you a good version.

How about the attached?

Cheers,

Mark

[-- Attachment #2: 0001-Only-add-fno-var-tracking-assignments-workaround-for.patch --]
[-- Type: text/x-diff, Size: 1283 bytes --]

From 48628d3cf2d829a90cd6622355eada1b30cb10c1 Mon Sep 17 00:00:00 2001
From: Mark Wielaard <mark@klomp.org>
Date: Sat, 10 Oct 2020 22:47:21 +0200
Subject: [PATCH] Only add -fno-var-tracking-assignments workaround for old GCC
 versions.

Some old GCC versions between 4.5.0 and 4.9.1 might miscompile code
with -fvar-tracking-assignments (which is enabled by default with -g -O2).
commit 2062afb4f added -fno-var-tracking-assignments unconditionally to
workaround this. But newer versions of GCC no longer have this bug, so
only add it for versions of GCC before 5.0.

Signed-off-by: Mark Wielaard <mark@klomp.org>
---
 Makefile | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index f84d7e4ca0be..4f4a9416a87a 100644
--- a/Makefile
+++ b/Makefile
@@ -813,7 +813,9 @@ KBUILD_CFLAGS += -ftrivial-auto-var-init=zero
 KBUILD_CFLAGS += -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang
 endif
 
-DEBUG_CFLAGS := $(call cc-option, -fno-var-tracking-assignments)
+# Workaround https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61801
+# for old versions of GCC.
+DEBUG_CFLAGS := $(call cc-ifversion, -lt, 0500, $(call cc-option, -fno-var-tracking-assignments))
 
 ifdef CONFIG_DEBUG_INFO
 ifdef CONFIG_DEBUG_INFO_SPLIT
-- 
2.18.4

^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: Additional debug info to aid cacheline analysis
  2020-10-10 20:58 ` Mark Wielaard
@ 2020-10-10 21:51 ` Mark Wielaard
  [not found] ` <20201010220712.5352-1-mark@klomp.org>
  2020-10-10 22:33 ` [PATCH] " Mark Wielaard
  2020-10-11 11:04 ` Additional debug info to aid cacheline analysis Segher Boessenkool
  2020-10-11 12:15 ` Florian Weimer
  2 siblings, 2 replies; 27+ messages in thread
From: Mark Wielaard @ 2020-10-10 21:51 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Peter Zijlstra, Stephane Eranian, linux-toolchains,
      Arnaldo Carvalho de Melo, linux-kernel, Ingo Molnar, Jiri Olsa,
      Namhyung Kim, Ian Rogers, Phillips, Kim, Mark Rutland,
      Masami Hiramatsu

On Sat, Oct 10, 2020 at 10:58:36PM +0200, Mark Wielaard wrote:
> Yes, that would work. I don't know what the lowest supported GCC
> version is, but technically it was definitely fixed in 4.10.0, 4.8.4
> and 4.9.2. And various distros would probably have backported the
> fix. But checking for 5.0+ would certainly give you a good version.
>
> How about the attached?

Looks like vger just throws away emails with patch attachments.
How odd. I'll try sending it as reply to this message with
git-send-email.

Cheers,

Mark

^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <20201010220712.5352-1-mark@klomp.org>]
* Re: [PATCH] Only add -fno-var-tracking-assignments workaround for old GCC versions. [not found] ` <20201010220712.5352-1-mark@klomp.org> @ 2020-10-10 22:21 ` Ian Rogers 2020-10-12 18:59 ` Nick Desaulniers 2020-10-14 11:01 ` Mark Wielaard 0 siblings, 2 replies; 27+ messages in thread From: Ian Rogers @ 2020-10-10 22:21 UTC (permalink / raw) To: Mark Wielaard Cc: Andi Kleen, linux-toolchains, LKML, Peter Zijlstra, Stephane Eranian, Arnaldo Carvalho de Melo, Ingo Molnar, Jiri Olsa, Namhyung Kim, Phillips, Kim, Mark Rutland, Masami Hiramatsu On Sat, Oct 10, 2020 at 3:08 PM Mark Wielaard <mark@klomp.org> wrote: > > Some old GCC versions between 4.5.0 and 4.9.1 might miscompile code > with -fvar-tracking-assingments (which is enabled by default with -g -O2). > commit 2062afb4f added -fno-var-tracking-assignments unconditionally to > work around this. But newer versions of GCC no longer have this bug, so > only add it for versions of GCC before 5.0. > > Signed-off-by: Mark Wielaard <mark@klomp.org> Acked-by: Ian Rogers <irogers@google.com> Thanks, Ian > --- > Makefile | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/Makefile b/Makefile > index f84d7e4ca0be..4f4a9416a87a 100644 > --- a/Makefile > +++ b/Makefile > @@ -813,7 +813,9 @@ KBUILD_CFLAGS += -ftrivial-auto-var-init=zero > KBUILD_CFLAGS += -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang > endif > > -DEBUG_CFLAGS := $(call cc-option, -fno-var-tracking-assignments) > +# Workaround https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61801 > +# for old versions of GCC. > +DEBUG_CFLAGS := $(call cc-ifversion, -lt, 0500, $(call cc-option, -fno-var-tracking-assignments)) > > ifdef CONFIG_DEBUG_INFO > ifdef CONFIG_DEBUG_INFO_SPLIT > -- > 2.18.4 > ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] Only add -fno-var-tracking-assignments workaround for old GCC versions. 2020-10-10 22:21 ` [PATCH] Only add -fno-var-tracking-assignments workaround for old GCC versions Ian Rogers @ 2020-10-12 18:59 ` Nick Desaulniers 2020-10-12 19:12 ` Mark Wielaard 2020-10-14 11:01 ` Mark Wielaard 1 sibling, 1 reply; 27+ messages in thread From: Nick Desaulniers @ 2020-10-12 18:59 UTC (permalink / raw) To: Ian Rogers Cc: Mark Wielaard, Andi Kleen, linux-toolchains, LKML, Peter Zijlstra, Stephane Eranian, Arnaldo Carvalho de Melo, Ingo Molnar, Jiri Olsa, Namhyung Kim, Phillips, Kim, Mark Rutland, Masami Hiramatsu On Sat, Oct 10, 2020 at 3:57 PM Ian Rogers <irogers@google.com> wrote: > > On Sat, Oct 10, 2020 at 3:08 PM Mark Wielaard <mark@klomp.org> wrote: > > > > Some old GCC versions between 4.5.0 and 4.9.1 might miscompile code > > with -fvar-tracking-assingments (which is enabled by default with -g -O2). > > commit 2062afb4f added -fno-var-tracking-assignments unconditionally to > > work around this. But newer versions of GCC no longer have this bug, so > > only add it for versions of GCC before 5.0. > > > > Signed-off-by: Mark Wielaard <mark@klomp.org> > > Acked-by: Ian Rogers <irogers@google.com> > > Thanks, > Ian > > > --- > > Makefile | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/Makefile b/Makefile > > index f84d7e4ca0be..4f4a9416a87a 100644 > > --- a/Makefile > > +++ b/Makefile > > @@ -813,7 +813,9 @@ KBUILD_CFLAGS += -ftrivial-auto-var-init=zero > > KBUILD_CFLAGS += -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang > > endif > > > > -DEBUG_CFLAGS := $(call cc-option, -fno-var-tracking-assignments) > > +# Workaround https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61801 > > +# for old versions of GCC. > > +DEBUG_CFLAGS := $(call cc-ifversion, -lt, 0500, $(call cc-option, -fno-var-tracking-assignments)) Should this be wrapped in: `ifdef CONFIG_CC_IS_GCC`/`endif`? 
> > > > ifdef CONFIG_DEBUG_INFO > > ifdef CONFIG_DEBUG_INFO_SPLIT > > -- > > 2.18.4 > > -- Thanks, ~Nick Desaulniers ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] Only add -fno-var-tracking-assignments workaround for old GCC versions. 2020-10-12 18:59 ` Nick Desaulniers @ 2020-10-12 19:12 ` Mark Wielaard 2020-10-14 15:31 ` Sedat Dilek 0 siblings, 1 reply; 27+ messages in thread From: Mark Wielaard @ 2020-10-12 19:12 UTC (permalink / raw) To: Nick Desaulniers, Ian Rogers Cc: Andi Kleen, linux-toolchains, LKML, Peter Zijlstra, Stephane Eranian, Arnaldo Carvalho de Melo, Ingo Molnar, Jiri Olsa, Namhyung Kim, Phillips, Kim, Mark Rutland, Masami Hiramatsu Hi, On Mon, 2020-10-12 at 11:59 -0700, Nick Desaulniers wrote: > On Sat, Oct 10, 2020 at 3:57 PM Ian Rogers <irogers@google.com> > wrote: > > On Sat, Oct 10, 2020 at 3:08 PM Mark Wielaard <mark@klomp.org> > > wrote: > > > -DEBUG_CFLAGS := $(call cc-option, -fno-var-tracking- > > > assignments) > > > +# Workaround https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61801 > > > +# for old versions of GCC. > > > +DEBUG_CFLAGS := $(call cc-ifversion, -lt, 0500, $(call cc- > > > option, -fno-var-tracking-assignments)) > > Should this be wrapped in: `ifdef CONFIG_CC_IS_GCC`/`endif`? I don't think so. It wasn't before. And call cc-option makes sure to only add the flag if the compiler supports it (clang doesn't and it also has a much higher version). Cheers, Mark ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] Only add -fno-var-tracking-assignments workaround for old GCC versions. 2020-10-12 19:12 ` Mark Wielaard @ 2020-10-14 15:31 ` Sedat Dilek 0 siblings, 0 replies; 27+ messages in thread From: Sedat Dilek @ 2020-10-14 15:31 UTC (permalink / raw) To: Mark Wielaard Cc: Nick Desaulniers, Ian Rogers, Andi Kleen, linux-toolchains, LKML, Peter Zijlstra, Stephane Eranian, Arnaldo Carvalho de Melo, Ingo Molnar, Jiri Olsa, Namhyung Kim, Phillips, Kim, Mark Rutland, Masami Hiramatsu On Mon, Oct 12, 2020 at 9:12 PM Mark Wielaard <mark@klomp.org> wrote: > > Hi, > > On Mon, 2020-10-12 at 11:59 -0700, Nick Desaulniers wrote: > > On Sat, Oct 10, 2020 at 3:57 PM Ian Rogers <irogers@google.com> > > wrote: > > > On Sat, Oct 10, 2020 at 3:08 PM Mark Wielaard <mark@klomp.org> > > > wrote: > > > > -DEBUG_CFLAGS := $(call cc-option, -fno-var-tracking- > > > > assignments) > > > > +# Workaround https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61801 > > > > +# for old versions of GCC. > > > > +DEBUG_CFLAGS := $(call cc-ifversion, -lt, 0500, $(call cc- > > > > option, -fno-var-tracking-assignments)) > > > > Should this be wrapped in: `ifdef CONFIG_CC_IS_GCC`/`endif`? > > I don't think so. It wasn't before. And call cc-option makes sure to > only add the flag if the compiler supports it (clang doesn't and it > also has a much higher version). > I am also in favour of `ifdef CONFIG_CC_IS_GCC` to clearly say this is a GCC bug. For the comment something like: # Workaround for GCC version <= 5.0 # GCC Bug: <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61801> Think of people grepping in the Linux source code for supported or broken compiler (versions)... As a reference see ClangBuiltLinux issue #427 "audit use of __GNUC__". [2] says: "There's also a ton of __GNUC_MINOR__ checks against unsupported GCC versions." 
- Sedat - [1] https://github.com/ClangBuiltLinux/linux/issues/427 [2] https://github.com/ClangBuiltLinux/linux/issues/427#issuecomment-700935241 ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH] Only add -fno-var-tracking-assignments workaround for old GCC versions. 2020-10-10 22:21 ` [PATCH] Only add -fno-var-tracking-assignments workaround for old GCC versions Ian Rogers 2020-10-12 18:59 ` Nick Desaulniers @ 2020-10-14 11:01 ` Mark Wielaard 2020-10-14 15:17 ` Andi Kleen 2020-10-17 12:01 ` [PATCH V2] " Mark Wielaard 1 sibling, 2 replies; 27+ messages in thread From: Mark Wielaard @ 2020-10-14 11:01 UTC (permalink / raw) To: linux-kernel, Masahiro Yamada, Michal Marek, linux-kbuild Cc: Ian Rogers, Mark Wielaard, linux-toolchains, Andi Kleen, Nick Desaulniers, Segher Boessenkool, Florian Weimer Some old GCC versions between 4.5.0 and 4.9.1 might miscompile code with -fvar-tracking-assingments (which is enabled by default with -g -O2). commit 2062afb4f added -fno-var-tracking-assignments unconditionally to work around this. But newer versions of GCC no longer have this bug, so only add it for versions of GCC before 5.0. Signed-off-by: Mark Wielaard <mark@klomp.org> Acked-by: Ian Rogers <irogers@google.com> Cc: linux-toolchains@vger.kernel.org Cc: Andi Kleen <andi@firstfloor.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Florian Weimer <fw@deneb.enyo.de> --- Makefile | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 51540b291738..8477fee5f309 100644 --- a/Makefile +++ b/Makefile @@ -813,7 +813,9 @@ KBUILD_CFLAGS += -ftrivial-auto-var-init=zero KBUILD_CFLAGS += -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang endif -DEBUG_CFLAGS := $(call cc-option, -fno-var-tracking-assignments) +# Workaround https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61801 +# for old versions of GCC. +DEBUG_CFLAGS := $(call cc-ifversion, -lt, 0500, $(call cc-option, -fno-var-tracking-assignments)) ifdef CONFIG_DEBUG_INFO ifdef CONFIG_DEBUG_INFO_SPLIT -- 2.18.4 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH] Only add -fno-var-tracking-assignments workaround for old GCC versions. 2020-10-14 11:01 ` Mark Wielaard @ 2020-10-14 15:17 ` Andi Kleen 2020-10-17 12:01 ` [PATCH V2] " Mark Wielaard 1 sibling, 0 replies; 27+ messages in thread From: Andi Kleen @ 2020-10-14 15:17 UTC (permalink / raw) To: Mark Wielaard Cc: linux-kernel, Masahiro Yamada, Michal Marek, linux-kbuild, Ian Rogers, linux-toolchains, Andi Kleen, Nick Desaulniers, Segher Boessenkool, Florian Weimer On Wed, Oct 14, 2020 at 01:01:32PM +0200, Mark Wielaard wrote: > Some old GCC versions between 4.5.0 and 4.9.1 might miscompile code > with -fvar-tracking-assingments (which is enabled by default with -g -O2). > commit 2062afb4f added -fno-var-tracking-assignments unconditionally to > work around this. But newer versions of GCC no longer have this bug, so > only add it for versions of GCC before 5.0. Add ... This allows various tools such as a perf probe or gdb debuggers or systemtap to resolve variable locations using dwarf locations in more code. > > Signed-off-by: Mark Wielaard <mark@klomp.org> > Acked-by: Ian Rogers <irogers@google.com> > Cc: linux-toolchains@vger.kernel.org > Cc: Andi Kleen <andi@firstfloor.org> > Cc: Nick Desaulniers <ndesaulniers@google.com> > Cc: Segher Boessenkool <segher@kernel.crashing.org> > Cc: Florian Weimer <fw@deneb.enyo.de> Reviewed-by: Andi Kleen <ak@linux.intel.com> -Andi ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH V2] Only add -fno-var-tracking-assignments workaround for old GCC versions.
  2020-10-14 11:01 ` Mark Wielaard
  2020-10-14 15:17 ` Andi Kleen
@ 2020-10-17 12:01 ` Mark Wielaard
  2020-10-19 19:30 ` Nick Desaulniers
  2020-10-20 15:27 ` Masahiro Yamada
  1 sibling, 2 replies; 27+ messages in thread
From: Mark Wielaard @ 2020-10-17 12:01 UTC (permalink / raw)
  To: linux-kernel, Masahiro Yamada, Michal Marek, linux-kbuild
  Cc: Ian Rogers, Andi Kleen, Mark Wielaard, linux-toolchains,
      Nick Desaulniers, Segher Boessenkool, Florian Weimer, Sedat Dilek

Some old GCC versions between 4.5.0 and 4.9.1 might miscompile code
with -fvar-tracking-assignments (which is enabled by default with -g -O2).
commit 2062afb4f added -fno-var-tracking-assignments unconditionally to
work around this. But newer versions of GCC no longer have this bug, so
only add it for versions of GCC before 5.0. This allows various tools
such as a perf probe or gdb debuggers or systemtap to resolve variable
locations using dwarf locations in more code.

Changes in V2:
- Update commit message explaining purpose.
- Explicitly mention GCC version in comment.
- Wrap workaround in ifdef CONFIG_CC_IS_GCC

Signed-off-by: Mark Wielaard <mark@klomp.org>
Acked-by: Ian Rogers <irogers@google.com>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Cc: linux-toolchains@vger.kernel.org
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Cc: Florian Weimer <fw@deneb.enyo.de>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
---
 Makefile | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 51540b291738..964754b4cedf 100644
--- a/Makefile
+++ b/Makefile
@@ -813,7 +813,11 @@ KBUILD_CFLAGS += -ftrivial-auto-var-init=zero
 KBUILD_CFLAGS += -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang
 endif
 
-DEBUG_CFLAGS := $(call cc-option, -fno-var-tracking-assignments)
+# Workaround for GCC versions < 5.0
+# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61801
+ifdef CONFIG_CC_IS_GCC
+DEBUG_CFLAGS := $(call cc-ifversion, -lt, 0500, $(call cc-option, -fno-var-tracking-assignments))
+endif
 
 ifdef CONFIG_DEBUG_INFO
 ifdef CONFIG_DEBUG_INFO_SPLIT
-- 
2.18.4

^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH V2] Only add -fno-var-tracking-assignments workaround for old GCC versions. 2020-10-17 12:01 ` [PATCH V2] " Mark Wielaard @ 2020-10-19 19:30 ` Nick Desaulniers 2020-10-20 15:27 ` Masahiro Yamada 1 sibling, 0 replies; 27+ messages in thread From: Nick Desaulniers @ 2020-10-19 19:30 UTC (permalink / raw) To: Mark Wielaard Cc: LKML, Masahiro Yamada, Michal Marek, Linux Kbuild mailing list, Ian Rogers, Andi Kleen, linux-toolchains, Segher Boessenkool, Florian Weimer, Sedat Dilek

On Sat, Oct 17, 2020 at 5:02 AM Mark Wielaard <mark@klomp.org> wrote:
>
> Some old GCC versions between 4.5.0 and 4.9.1 might miscompile code
> with -fvar-tracking-assignments (which is enabled by default with -g -O2).
> commit 2062afb4f added -fno-var-tracking-assignments unconditionally to
> work around this. But newer versions of GCC no longer have this bug, so
> only add it for versions of GCC before 5.0. This allows various tools
> such as a perf probe or gdb debuggers or systemtap to resolve variable
> locations using dwarf locations in more code.
>
> Changes in V2:
> - Update commit message explaining purpose.
> - Explicitly mention GCC version in comment.
> - Wrap workaround in ifdef CONFIG_CC_IS_GCC
>
> Signed-off-by: Mark Wielaard <mark@klomp.org>
> Acked-by: Ian Rogers <irogers@google.com>
> Reviewed-by: Andi Kleen <andi@firstfloor.org>
> Cc: linux-toolchains@vger.kernel.org
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Cc: Segher Boessenkool <segher@kernel.crashing.org>
> Cc: Florian Weimer <fw@deneb.enyo.de>
> Cc: Sedat Dilek <sedat.dilek@gmail.com>
> ---
>  Makefile | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/Makefile b/Makefile
> index 51540b291738..964754b4cedf 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -813,7 +813,11 @@ KBUILD_CFLAGS += -ftrivial-auto-var-init=zero
>  KBUILD_CFLAGS += -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang
>  endif
>
> -DEBUG_CFLAGS := $(call cc-option, -fno-var-tracking-assignments)
> +# Workaround for GCC versions < 5.0
> +# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61801
> +ifdef CONFIG_CC_IS_GCC
> +DEBUG_CFLAGS := $(call cc-ifversion, -lt, 0500, $(call cc-option, -fno-var-tracking-assignments))

Thanks for adding the comment. That will help us find+remove this
when the kernel's minimum supported version of GCC advances to gcc-5.

The current minimum supported version of GCC according to
Documentation/process/changes.rst is gcc-4.9 (so anything older is
irrelevant, and we drop support for it). If gcc 4.9 supports
`-fno-var-tracking-assignments` (it looks like it does:
https://godbolt.org/z/oa53f5), then we should drop the `cc-option`
call, which will save us a compiler invocation for each invocation of
`make`.

> +endif
>
>  ifdef CONFIG_DEBUG_INFO
>  ifdef CONFIG_DEBUG_INFO_SPLIT
> --
> 2.18.4
>

-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V2] Only add -fno-var-tracking-assignments workaround for old GCC versions. 2020-10-17 12:01 ` [PATCH V2] " Mark Wielaard 2020-10-19 19:30 ` Nick Desaulniers @ 2020-10-20 15:27 ` Masahiro Yamada 1 sibling, 0 replies; 27+ messages in thread From: Masahiro Yamada @ 2020-10-20 15:27 UTC (permalink / raw) To: Mark Wielaard Cc: Linux Kernel Mailing List, Michal Marek, Linux Kbuild mailing list, Ian Rogers, Andi Kleen, linux-toolchains, Nick Desaulniers, Segher Boessenkool, Florian Weimer, Sedat Dilek

On Sat, Oct 17, 2020 at 9:02 PM Mark Wielaard <mark@klomp.org> wrote:
>
> Some old GCC versions between 4.5.0 and 4.9.1 might miscompile code
> with -fvar-tracking-assignments (which is enabled by default with -g -O2).
> commit 2062afb4f added -fno-var-tracking-assignments unconditionally to
> work around this. But newer versions of GCC no longer have this bug, so
> only add it for versions of GCC before 5.0. This allows various tools
> such as a perf probe or gdb debuggers or systemtap to resolve variable
> locations using dwarf locations in more code.
>
> Changes in V2:
> - Update commit message explaining purpose.
> - Explicitly mention GCC version in comment.
> - Wrap workaround in ifdef CONFIG_CC_IS_GCC
>
> Signed-off-by: Mark Wielaard <mark@klomp.org>
> Acked-by: Ian Rogers <irogers@google.com>
> Reviewed-by: Andi Kleen <andi@firstfloor.org>
> Cc: linux-toolchains@vger.kernel.org
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Cc: Segher Boessenkool <segher@kernel.crashing.org>
> Cc: Florian Weimer <fw@deneb.enyo.de>
> Cc: Sedat Dilek <sedat.dilek@gmail.com>
> ---

Applied to linux-kbuild.
Thanks.

>  Makefile | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/Makefile b/Makefile
> index 51540b291738..964754b4cedf 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -813,7 +813,11 @@ KBUILD_CFLAGS += -ftrivial-auto-var-init=zero
>  KBUILD_CFLAGS += -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang
>  endif
>
> -DEBUG_CFLAGS := $(call cc-option, -fno-var-tracking-assignments)
> +# Workaround for GCC versions < 5.0
> +# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61801
> +ifdef CONFIG_CC_IS_GCC
> +DEBUG_CFLAGS := $(call cc-ifversion, -lt, 0500, $(call cc-option, -fno-var-tracking-assignments))
> +endif
>
>  ifdef CONFIG_DEBUG_INFO
>  ifdef CONFIG_DEBUG_INFO_SPLIT
> --
> 2.18.4
>

-- 
Best Regards
Masahiro Yamada

^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH] Only add -fno-var-tracking-assignments workaround for old GCC versions. 2020-10-10 21:51 ` Mark Wielaard [not found] ` <20201010220712.5352-1-mark@klomp.org> @ 2020-10-10 22:33 ` Mark Wielaard 1 sibling, 0 replies; 27+ messages in thread From: Mark Wielaard @ 2020-10-10 22:33 UTC (permalink / raw) To: linux-toolchains, linux-kernel; +Cc: Peter Zijlstra, Mark Wielaard

Some old GCC versions between 4.5.0 and 4.9.1 might miscompile code
with -fvar-tracking-assignments (which is enabled by default with -g -O2).
commit 2062afb4f added -fno-var-tracking-assignments unconditionally to
work around this. But newer versions of GCC no longer have this bug, so
only add it for versions of GCC before 5.0.

Signed-off-by: Mark Wielaard <mark@klomp.org>
---
 Makefile | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index f84d7e4ca0be..4f4a9416a87a 100644
--- a/Makefile
+++ b/Makefile
@@ -813,7 +813,9 @@ KBUILD_CFLAGS += -ftrivial-auto-var-init=zero
 KBUILD_CFLAGS += -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang
 endif
 
-DEBUG_CFLAGS := $(call cc-option, -fno-var-tracking-assignments)
+# Workaround https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61801
+# for old versions of GCC.
+DEBUG_CFLAGS := $(call cc-ifversion, -lt, 0500, $(call cc-option, -fno-var-tracking-assignments))
 
 ifdef CONFIG_DEBUG_INFO
 ifdef CONFIG_DEBUG_INFO_SPLIT
-- 
2.18.4

^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: Additional debug info to aid cacheline analysis 2020-10-10 20:58 ` Mark Wielaard 2020-10-10 21:51 ` Mark Wielaard @ 2020-10-11 11:04 ` Segher Boessenkool 2020-10-11 12:15 ` Florian Weimer 2 siblings, 0 replies; 27+ messages in thread From: Segher Boessenkool @ 2020-10-11 11:04 UTC (permalink / raw) To: Mark Wielaard Cc: Andi Kleen, Peter Zijlstra, Stephane Eranian, linux-toolchains, Arnaldo Carvalho de Melo, linux-kernel, Ingo Molnar, Jiri Olsa, Namhyung Kim, Ian Rogers, Phillips, Kim, Mark Rutland, Masami Hiramatsu Hi! On Sat, Oct 10, 2020 at 10:58:36PM +0200, Mark Wielaard wrote: > On Thu, Oct 08, 2020 at 02:23:00PM -0700, Andi Kleen wrote: > > So I guess could disable it for 5.0+ only. > > Yes, that would work. I don't know what the lowest supported GCC > version is, but technically it was definitely fixed in 4.10.0, 4.8.4 > and 4.9.2. Fwiw, GCC 4.10 was renamed to GCC 5 before it was released (it was the first release with the new version number scheme). Only old development versions (that no one should use) identify as 4.10. > And various distros would probably have backported the > fix. But checking for 5.0+ would certainly give you a good version. Yes, esp. since some versions of 4.9 and 4.8 are still buggy. No one should use any version for which a newer bug-fix release has long been available, but do you want to deal with bugs from people who do not? Segher ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Additional debug info to aid cacheline analysis 2020-10-10 20:58 ` Mark Wielaard 2020-10-10 21:51 ` Mark Wielaard 2020-10-11 11:04 ` Additional debug info to aid cacheline analysis Segher Boessenkool @ 2020-10-11 12:15 ` Florian Weimer 2020-10-11 12:23 ` Mark Wielaard 2 siblings, 1 reply; 27+ messages in thread From: Florian Weimer @ 2020-10-11 12:15 UTC (permalink / raw) To: Mark Wielaard Cc: Andi Kleen, Peter Zijlstra, Stephane Eranian, linux-toolchains, Arnaldo Carvalho de Melo, linux-kernel, Ingo Molnar, Jiri Olsa, Namhyung Kim, Ian Rogers, Phillips, Kim, Mark Rutland, Masami Hiramatsu * Mark Wielaard: > Yes, that would work. I don't know what the lowest supported GCC > version is, but technically it was definitely fixed in 4.10.0, 4.8.4 > and 4.9.2. And various distros would probably have backported the > fix. But checking for 5.0+ would certainly give you a good version. > > How about the attached? Would it be possible to test for the actual presence of the bug, using -fcompare-debug? (But it seems to me that the treatment of this particular compiler bug is an outlier: other equally tricky bugs do not receive this kind of attention.) ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Additional debug info to aid cacheline analysis 2020-10-11 12:15 ` Florian Weimer @ 2020-10-11 12:23 ` Mark Wielaard 2020-10-11 12:28 ` Florian Weimer 0 siblings, 1 reply; 27+ messages in thread From: Mark Wielaard @ 2020-10-11 12:23 UTC (permalink / raw) To: Florian Weimer Cc: Andi Kleen, Peter Zijlstra, Stephane Eranian, linux-toolchains, Arnaldo Carvalho de Melo, linux-kernel, Ingo Molnar, Jiri Olsa, Namhyung Kim, Ian Rogers, Phillips, Kim, Mark Rutland, Masami Hiramatsu

On Sun, Oct 11, 2020 at 02:15:18PM +0200, Florian Weimer wrote:
> * Mark Wielaard:
>
> > Yes, that would work. I don't know what the lowest supported GCC
> > version is, but technically it was definitely fixed in 4.10.0, 4.8.4
> > and 4.9.2. And various distros would probably have backported the
> > fix. But checking for 5.0+ would certainly give you a good version.
> >
> > How about the attached?
>
> Would it be possible to test for the actual presence of the bug, using
> -fcompare-debug?

Yes, that was discussed in the original commit message, but it was decided
that disabling it unconditionally was easier. See commit 2062afb4f.

Cheers,

Mark

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Additional debug info to aid cacheline analysis 2020-10-11 12:23 ` Mark Wielaard @ 2020-10-11 12:28 ` Florian Weimer 0 siblings, 0 replies; 27+ messages in thread From: Florian Weimer @ 2020-10-11 12:28 UTC (permalink / raw) To: Mark Wielaard Cc: Andi Kleen, Peter Zijlstra, Stephane Eranian, linux-toolchains, Arnaldo Carvalho de Melo, linux-kernel, Ingo Molnar, Jiri Olsa, Namhyung Kim, Ian Rogers, Phillips, Kim, Mark Rutland, Masami Hiramatsu

* Mark Wielaard:

> On Sun, Oct 11, 2020 at 02:15:18PM +0200, Florian Weimer wrote:
>> * Mark Wielaard:
>>
>> > Yes, that would work. I don't know what the lowest supported GCC
>> > version is, but technically it was definitely fixed in 4.10.0, 4.8.4
>> > and 4.9.2. And various distros would probably have backported the
>> > fix. But checking for 5.0+ would certainly give you a good version.
>> >
>> > How about the attached?
>>
>> Would it be possible to test for the actual presence of the bug, using
>> -fcompare-debug?
>
> Yes, that was discussed in the original commit message, but it was decided
> that disabling it unconditionally was easier. See commit 2062afb4f.

I think the short test case was not yet available at the time of the
Linux commit. But then it may not actually detect the bug in all
affected compilers.

Anyway, making this conditional on the GCC version is already a clear
improvement.

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Additional debug info to aid cacheline analysis 2020-10-08 9:32 ` Mark Wielaard 2020-10-08 21:23 ` Andi Kleen @ 2020-10-30 5:26 ` Namhyung Kim 2020-10-30 9:16 ` Mark Wielaard 1 sibling, 1 reply; 27+ messages in thread From: Namhyung Kim @ 2020-10-30 5:26 UTC (permalink / raw) To: Mark Wielaard Cc: Peter Zijlstra, Stephane Eranian, linux-toolchains, Arnaldo Carvalho de Melo, linux-kernel, Ingo Molnar, Jiri Olsa, Ian Rogers, Phillips, Kim, Mark Rutland, Andi Kleen, Masami Hiramatsu Hello, On Thu, Oct 8, 2020 at 6:38 PM Mark Wielaard <mark@klomp.org> wrote: > > Hi, > > On Thu, 2020-10-08 at 09:02 +0200, Peter Zijlstra wrote: > > Some time ago, I had my intern pursue the other 2 approaches for > > > symbolization. The one I see as most promising is by using the DWARF > > > information (no BPF needed). The good news is that I believe we do not > > > need more information than what is already there. We just need the > > > compiler to generate valid DWARF at most optimization levels, which I > > > believe is not the case for LLVM based compilers but maybe okay for > > > GCC. > > > > Right, I think GCC improved a lot on this front over the past few years. > > Also added Andi and Masami, who have worked on this or related topics. > > For GCC Alexandre Oliva did a really thorough write up of all the > various optimization and their effect on debugging/DWARF: > https://www.fsfla.org/~lxoliva/writeups/gOlogy/gOlogy.html Thanks for the link. Looks very nice. > > GCC using -fvar-tracking and -fvar-tracking-assignments is pretty good > at keeping track of where variables are held (in memory or registers) > when in the program, even through various optimizations. > > -fvar-tracking-assignments is the default with -g -O2. > Except for the upstream linux kernel code. Most distros enable it > again, but you do want to enable it by hand when building from the > upstream linux git repo. Please correct me if I'm wrong. This seems to track local variables. 
But I'm not sure it's enough for this purpose as we want to know
types of any memory references (not directly from a variable).

Let's say we have a variable like below:

  struct xxx a;

  a.b->c->d++;

And we have a sample where 'd' is updated, then how can we know
it's from the variable 'a'? Maybe we don't need to know it, but we
should know it accesses the 'd' field in the struct 'c'.

Probably we can analyze the asm code and figure out it's from 'a'
and accessing 'd' at the moment. I'm curious if there's a way in
the DWARF to help this kind of work.

Thanks
Namhyung

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Additional debug info to aid cacheline analysis 2020-10-30 5:26 ` Namhyung Kim @ 2020-10-30 9:16 ` Mark Wielaard 2020-10-30 10:10 ` Peter Zijlstra 0 siblings, 1 reply; 27+ messages in thread From: Mark Wielaard @ 2020-10-30 9:16 UTC (permalink / raw) To: Namhyung Kim Cc: Peter Zijlstra, Stephane Eranian, linux-toolchains, Arnaldo Carvalho de Melo, linux-kernel, Ingo Molnar, Jiri Olsa, Ian Rogers, Phillips, Kim, Mark Rutland, Andi Kleen, Masami Hiramatsu Hi Namhyung, On Fri, Oct 30, 2020 at 02:26:19PM +0900, Namhyung Kim wrote: > On Thu, Oct 8, 2020 at 6:38 PM Mark Wielaard <mark@klomp.org> wrote: > > GCC using -fvar-tracking and -fvar-tracking-assignments is pretty good > > at keeping track of where variables are held (in memory or registers) > > when in the program, even through various optimizations. > > > > -fvar-tracking-assignments is the default with -g -O2. > > Except for the upstream linux kernel code. Most distros enable it > > again, but you do want to enable it by hand when building from the > > upstream linux git repo. > > Please correct me if I'm wrong. This seems to track local variables. > But I'm not sure it's enough for this purpose as we want to know > types of any memory references (not directly from a variable). > > Let's say we have a variable like below: > > struct xxx a; > > a.b->c->d++; > > And we have a sample where 'd' is updated, then how can we know > it's from the variable 'a'? Maybe we don't need to know it, but we > should know it accesses the 'd' field in the struct 'c'. > > Probably we can analyze the asm code and figure out it's from 'a' > and accessing 'd' at the moment. I'm curious if there's a way in > the DWARF to help this kind of work. DWARF does have that information, but it stores it in a way that is kind of opposite to how you want to access it. Given a variable and an address, you can easily get the location where that variable is stored. 
But if you want to map back from a given (memory) location and address to the variable, that is more work. In theory what you could do is make a list of global variables from the top-level DWARF CUs. Then take the debug aranges to map from the program address to the DWARF CU that covers that address. Then for that CU you would walk the CU DIE tree while keeping track of all variables in scope till you find the function covering that address. Then for each global variable and all variables in scope you get the DWARF location description at the given address (for global ones that is most likely always a static address, but for local ones it depends on where exactly in the program you take the sample). That plus the type information for each variable should then make it possible to see which variable covers the given memory location. But that is a lot of work to do for each sample. Cheers, Mark ^ permalink raw reply [flat|nested] 27+ messages in thread
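The last step described above (deciding which in-scope variable covers the sampled data address) might look roughly like the sketch below. It is only an illustration: `struct var_range` and `find_covering_var()` are hypothetical names, and the table is assumed to have been filled beforehand from the DWARF location descriptions and type sizes for the sampled IP, which is the expensive part the message warns about.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one variable in scope at the sampled IP. */
struct var_range {
	const char *name;	/* variable name taken from its DIE */
	uint64_t start;		/* location resolved for this IP */
	uint64_t size;		/* byte size of the variable's type */
};

/* Return the variable whose storage covers addr, or NULL if none does. */
const struct var_range *
find_covering_var(const struct var_range *vars, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		/* Covered when start <= addr < start + size. */
		if (addr >= vars[i].start && addr - vars[i].start < vars[i].size)
			return &vars[i];
	}
	return NULL;
}
```

The linear scan runs once per sample, on top of the CU lookup and DIE tree walk, so a real tool would presumably cache the per-function variable list rather than rebuild it each time.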
* Re: Additional debug info to aid cacheline analysis 2020-10-30 9:16 ` Mark Wielaard @ 2020-10-30 10:10 ` Peter Zijlstra 2020-11-02 8:27 ` Masami Hiramatsu 0 siblings, 1 reply; 27+ messages in thread From: Peter Zijlstra @ 2020-10-30 10:10 UTC (permalink / raw) To: Mark Wielaard Cc: Namhyung Kim, Stephane Eranian, linux-toolchains, Arnaldo Carvalho de Melo, linux-kernel, Ingo Molnar, Jiri Olsa, Ian Rogers, Phillips, Kim, Mark Rutland, Andi Kleen, Masami Hiramatsu On Fri, Oct 30, 2020 at 10:16:49AM +0100, Mark Wielaard wrote: > Hi Namhyung, > > On Fri, Oct 30, 2020 at 02:26:19PM +0900, Namhyung Kim wrote: > > On Thu, Oct 8, 2020 at 6:38 PM Mark Wielaard <mark@klomp.org> wrote: > > > GCC using -fvar-tracking and -fvar-tracking-assignments is pretty good > > > at keeping track of where variables are held (in memory or registers) > > > when in the program, even through various optimizations. > > > > > > -fvar-tracking-assignments is the default with -g -O2. > > > Except for the upstream linux kernel code. Most distros enable it > > > again, but you do want to enable it by hand when building from the > > > upstream linux git repo. > > > > Please correct me if I'm wrong. This seems to track local variables. > > But I'm not sure it's enough for this purpose as we want to know > > types of any memory references (not directly from a variable). > > > > Let's say we have a variable like below: > > > > struct xxx a; > > > > a.b->c->d++; > > > > And we have a sample where 'd' is updated, then how can we know > > it's from the variable 'a'? Maybe we don't need to know it, but we > > should know it accesses the 'd' field in the struct 'c'. > > > > Probably we can analyze the asm code and figure out it's from 'a' > > and accessing 'd' at the moment. I'm curious if there's a way in > > the DWARF to help this kind of work. > > DWARF does have that information, but it stores it in a way that is > kind of opposite to how you want to access it. 
Given a variable and an
> address, you can easily get the location where that variable is
> stored. But if you want to map back from a given (memory) location and
> address to the variable, that is more work.

The principal idea in this thread doesn't care about the address of the
variables. The idea was to get the data type and member information from
the instruction.

So in the above example: a.b->c->d++; what we'll end up with is
something like:

	inc	8(%rax)

Where %rax contains c, and the offset of d in c is 8.

So what we want to (easily) find for that instruction is c::d.

So given any instruction with a memop (either load or store) we want to
find: type::member.

^ permalink raw reply [flat|nested] 27+ messages in thread
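Once the type behind the base register is known, the type::member lookup itself is simple. The sketch below assumes a hypothetical layout for `struct c` in which `d` sits at offset 8, as in the example; `struct member_desc`, `struct type_desc` and `find_member()` are made-up names standing in for whatever a tool would build from the debug info.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout in which d sits at offset 8, as in the example. */
struct c {
	uint64_t other;
	uint64_t d;
};

/* Made-up descriptors standing in for data read from debug info. */
struct member_desc {
	const char *name;
	size_t offset;
	size_t size;
};

struct type_desc {
	const char *name;
	const struct member_desc *members;
	size_t nr_members;
};

static const struct member_desc c_members[] = {
	{ "other", offsetof(struct c, other), sizeof(uint64_t) },
	{ "d",     offsetof(struct c, d),     sizeof(uint64_t) },
};

static const struct type_desc c_type = { "c", c_members, 2 };

/* Map a displacement such as the 8 in "inc 8(%rax)" to a member name. */
const char *find_member(const struct type_desc *type, size_t offset)
{
	for (size_t i = 0; i < type->nr_members; i++) {
		const struct member_desc *m = &type->members[i];

		if (offset >= m->offset && offset - m->offset < m->size)
			return m->name;
	}
	return NULL;
}
```

With these tables, `find_member(&c_type, 8)` yields the name `d`, i.e. c::d. The hard part, knowing that %rax holds a pointer to `struct c` at that instruction, is not covered here.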
* Re: Additional debug info to aid cacheline analysis 2020-10-30 10:10 ` Peter Zijlstra @ 2020-11-02 8:27 ` Masami Hiramatsu 2020-11-03 4:22 ` Namhyung Kim 0 siblings, 1 reply; 27+ messages in thread From: Masami Hiramatsu @ 2020-11-02 8:27 UTC (permalink / raw) To: Peter Zijlstra Cc: Mark Wielaard, Namhyung Kim, Stephane Eranian, linux-toolchains, Arnaldo Carvalho de Melo, linux-kernel, Ingo Molnar, Jiri Olsa, Ian Rogers, Phillips, Kim, Mark Rutland, Andi Kleen, Masami Hiramatsu

Hi,

On Fri, 30 Oct 2020 11:10:04 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, Oct 30, 2020 at 10:16:49AM +0100, Mark Wielaard wrote:
> > Hi Namhyung,
> >
> > On Fri, Oct 30, 2020 at 02:26:19PM +0900, Namhyung Kim wrote:
> > > On Thu, Oct 8, 2020 at 6:38 PM Mark Wielaard <mark@klomp.org> wrote:
> > > > GCC using -fvar-tracking and -fvar-tracking-assignments is pretty good
> > > > at keeping track of where variables are held (in memory or registers)
> > > > when in the program, even through various optimizations.
> > > >
> > > > -fvar-tracking-assignments is the default with -g -O2.
> > > > Except for the upstream linux kernel code. Most distros enable it
> > > > again, but you do want to enable it by hand when building from the
> > > > upstream linux git repo.
> > >
> > > Please correct me if I'm wrong. This seems to track local variables.
> > > But I'm not sure it's enough for this purpose as we want to know
> > > types of any memory references (not directly from a variable).
> > >
> > > Let's say we have a variable like below:
> > >
> > >   struct xxx a;
> > >
> > >   a.b->c->d++;
> > >
> > > And we have a sample where 'd' is updated, then how can we know
> > > it's from the variable 'a'? Maybe we don't need to know it, but we
> > > should know it accesses the 'd' field in the struct 'c'.
> > >
> > > Probably we can analyze the asm code and figure out it's from 'a'
> > > and accessing 'd' at the moment. I'm curious if there's a way in
> > > the DWARF to help this kind of work.
> >
> > DWARF does have that information, but it stores it in a way that is
> > kind of opposite to how you want to access it. Given a variable and an
> > address, you can easily get the location where that variable is
> > stored. But if you want to map back from a given (memory) location and
> > address to the variable, that is more work.
>
> The principal idea in this thread doesn't care about the address of the
> variables. The idea was to get the data type and member information from
> the instruction.
>
> So in the above example: a.b->c->d++; what we'll end up with is
> something like:
>
>	inc	8(%rax)
>
> Where %rax contains c, and the offset of d in c is 8.

For this simple case, it is possible.

This offset information is stored in the DWARF as data-structure type
information. (perf-probe uses it to find how to get the given local var's
fields)

So if we do this off-line, I think it is possible if it is recorded with
instruction-pointers. For each place, we can do

 - decode instruction and get the access address.
 - get var assignment of %rax at that IP.
 - get type information of var and find the field from offset.

However, the problem is that if the DWARF has only assignment of "a",
we need to decode the function body. (and usually this happens)

func() {
 struct xxx a;
 ...
 a.b->c->d++;
}

In this case, only "a" is the local variable. So DWARF records assignment of
"a", not "b" nor "c" (since those are not names of variables, just names
of fields). GCC may generate something like

 mov 16(%rsp),%rdx // rdx = a.b
 mov 8(%rdx),%rax  // rax = b->c
 inc 8(%rax)       // c->d++

GCC only knows "a" is 0(%rsp), there are no other "assignments". Thus we need
to backtrace the %rax from the hit ip address until a known assignment register
appears.

Note that if there is a loop, we have to trace it back too, but it's harder,

func() {
 struct yyy a;
 int i;
 ...
 for (i = 0; i < 100; i++)
  a.b->c[i]++;
}

In this case, GCC will optimize "i" out and make an end-address.
(This is what GCC -O2 generated)

0000000000001190 <func>:
{
    1190:       f3 0f 1e fa             endbr64
        struct yyy a = *_a;
    1194:       48 8b 57 10             mov    0x10(%rdi),%rdx
        for (i = 0; i < 100; i++)
    1198:       48 8d 42 08             lea    0x8(%rdx),%rax
    119c:       48 81 c2 98 01 00 00    add    $0x198,%rdx
    11a3:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
                a.b->c[i]++;
    11a8:       83 00 01                addl   $0x1,(%rax)
        for (i = 0; i < 100; i++)
    11ab:       48 83 c0 04             add    $0x4,%rax
    11af:       48 39 d0                cmp    %rdx,%rax
    11b2:       75 f4                   jne    11a8 <func+0x18>
}
    11b4:       c3                      retq

If we ignore the array support, this can be simplified as

    1194:       48 8b 57 10             mov    0x10(%rdi),%rdx
    1198:       48 8d 42 08             lea    0x8(%rdx),%rax
    11a8:       83 00 01                addl   $0x1,(%rax)

and we may be able to decode it.

Thank you,

> So what we want to (easily) find for that instruction is c::d.
>
> So given any instruction with a memop (either load or store) we want to
> find: type::member.
>

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply [flat|nested] 27+ messages in thread
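The register tracing described above, for the straight-line `a.b->c->d++` case only, can be sketched as follows. Everything here is an assumption made for illustration: the type tables `xxx`, `bbb` and `ccc` with their offsets are invented to match the three-instruction example, `struct load` is a toy model of `mov off(%base),%dst`, and the sketch walks forward from the one register DWARF knows about (%rsp pointing at "a") instead of walking backwards from the sample. Loops, branches and `lea` arithmetic such as the array case are not handled.

```c
#include <stddef.h>

enum reg { RSP, RDX, RAX, NR_REGS };

struct type_desc;

/* One field of a struct; pointee is set when the field is a pointer. */
struct member_desc {
	const char *name;
	int offset;
	const struct type_desc *pointee;
};

struct type_desc {
	const char *name;
	const struct member_desc *members;
	int nr_members;
};

/* Toy model of "mov offset(%base),%dst". */
struct load {
	enum reg dst;
	enum reg base;
	int offset;
};

/* Invented layouts matching the example: xxx.b at 16, bbb.c at 8, ccc.d at 8. */
static const struct member_desc ccc_members[] = { { "d", 8, NULL } };
static const struct type_desc ccc = { "ccc", ccc_members, 1 };
static const struct member_desc bbb_members[] = { { "c", 8, &ccc } };
static const struct type_desc bbb = { "bbb", bbb_members, 1 };
static const struct member_desc xxx_members[] = { { "b", 16, &bbb } };
static const struct type_desc xxx = { "xxx", xxx_members, 1 };

static const struct member_desc *
member_at(const struct type_desc *t, int offset)
{
	for (int i = 0; t && i < t->nr_members; i++)
		if (t->members[i].offset == offset)
			return &t->members[i];
	return NULL;
}

/*
 * types[r] is the type of the object register r points at, or NULL when
 * unknown. Start from the single known register, propagate through the
 * loads, then resolve the final memop "op offset(%base)".
 */
const struct member_desc *
resolve_memop(const struct type_desc *known, enum reg known_reg,
	      const struct load *loads, int nr_loads,
	      enum reg base, int offset, const struct type_desc **owner)
{
	const struct type_desc *types[NR_REGS] = { NULL };

	types[known_reg] = known;
	for (int i = 0; i < nr_loads; i++) {
		const struct member_desc *m =
			member_at(types[loads[i].base], loads[i].offset);

		/* The destination now holds the loaded pointer, if any. */
		types[loads[i].dst] = m ? m->pointee : NULL;
	}
	*owner = types[base];
	return member_at(types[base], offset);
}
```

Feeding it the two `mov` instructions as `{ RDX, RSP, 16 }` and `{ RAX, RDX, 8 }`, with `inc 8(%rax)` as the final memop, resolves to member `d` of `ccc`, which is the type::member answer asked for earlier in the thread.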
* Re: Additional debug info to aid cacheline analysis 2020-11-02 8:27 ` Masami Hiramatsu @ 2020-11-03 4:22 ` Namhyung Kim 0 siblings, 0 replies; 27+ messages in thread From: Namhyung Kim @ 2020-11-03 4:22 UTC (permalink / raw) To: Masami Hiramatsu Cc: Peter Zijlstra, Mark Wielaard, Stephane Eranian, linux-toolchains, Arnaldo Carvalho de Melo, linux-kernel, Ingo Molnar, Jiri Olsa, Ian Rogers, Phillips, Kim, Mark Rutland, Andi Kleen Hi Masami, On Mon, Nov 2, 2020 at 5:27 PM Masami Hiramatsu <mhiramat@kernel.org> wrote: > > Hi, > > On Fri, 30 Oct 2020 11:10:04 +0100 > Peter Zijlstra <peterz@infradead.org> wrote: > > > On Fri, Oct 30, 2020 at 10:16:49AM +0100, Mark Wielaard wrote: > > > Hi Namhyung, > > > > > > On Fri, Oct 30, 2020 at 02:26:19PM +0900, Namhyung Kim wrote: > > > > On Thu, Oct 8, 2020 at 6:38 PM Mark Wielaard <mark@klomp.org> wrote: > > > > > GCC using -fvar-tracking and -fvar-tracking-assignments is pretty good > > > > > at keeping track of where variables are held (in memory or registers) > > > > > when in the program, even through various optimizations. > > > > > > > > > > -fvar-tracking-assignments is the default with -g -O2. > > > > > Except for the upstream linux kernel code. Most distros enable it > > > > > again, but you do want to enable it by hand when building from the > > > > > upstream linux git repo. > > > > > > > > Please correct me if I'm wrong. This seems to track local variables. > > > > But I'm not sure it's enough for this purpose as we want to know > > > > types of any memory references (not directly from a variable). > > > > > > > > Let's say we have a variable like below: > > > > > > > > struct xxx a; > > > > > > > > a.b->c->d++; > > > > > > > > And we have a sample where 'd' is updated, then how can we know > > > > it's from the variable 'a'? Maybe we don't need to know it, but we > > > > should know it accesses the 'd' field in the struct 'c'. 
> > > > > > > > Probably we can analyze the asm code and figure out it's from 'a' > > > > and accessing 'd' at the moment. I'm curious if there's a way in > > > > the DWARF to help this kind of work. > > > > > > DWARF does have that information, but it stores it in a way that is > > > kind of opposite to how you want to access it. Given a variable and an > > > address, you can easily get the location where that variable is > > > stored. But if you want to map back from a given (memory) location and > > > address to the variable, that is more work. > > > > The principal idea in this thread doesn't care about the address of the > > variables. The idea was to get the data type and member information from > > the instruction. > > > > So in the above example: a.b->c->d++; what we'll end up with is > > something like: > > > > inc 8(%rax) > > > > Where %rax contains c, and the offset of d in c is 8. > > For this simple case, it is possible. > > This offset information is stored in the DWARF as a data-structure type > information. (perf-probe uses it to find how to get the given local var's > fields) > > So if we do this off-line, I think it is possible if it is recorded with > instruction-pointers. For each place, we can do > > - decode instruction and get the access address. > - get var assignment of %rax at that IP. > - get type information of var and find the field from offset. > > However, the problem is that if the DWARF has only assignment of "a", > we need to decode the function body. (and usually this happens) > > func() { > struct xxx a; > ... > a.b->c->d++; > } > > In this case, only "a" is the local variable. So DWARF records assignment of > "a", not "b" nor "c" (since those are not a name of variables, just a name > of fields). 
GCC may generate something like
>
>  mov 16(%rsp),%rdx // rdx = a.b
>  mov 8(%rdx),%rax  // rax = b->c
>  inc 8(%rax)       // c->d++

Right, it'd be really nice if the compiler could add information about the
(hidden) assignments in the rdx and rax here.

Thanks
Namhyung

^ permalink raw reply [flat|nested] 27+ messages in thread
Thread overview: 27+ messages
2020-10-06 13:17 Additional debug info to aid cacheline analysis Peter Zijlstra
2020-10-06 19:00 ` Arnaldo Carvalho de Melo
2020-10-08 5:58 ` Stephane Eranian
2020-10-08 7:02 ` Peter Zijlstra
2020-10-08 9:32 ` Mark Wielaard
2020-10-08 21:23 ` Andi Kleen
2020-10-10 20:58 ` Mark Wielaard
2020-10-10 21:51 ` Mark Wielaard
[not found] ` <20201010220712.5352-1-mark@klomp.org>
2020-10-10 22:21 ` [PATCH] Only add -fno-var-tracking-assignments workaround for old GCC versions Ian Rogers
2020-10-12 18:59 ` Nick Desaulniers
2020-10-12 19:12 ` Mark Wielaard
2020-10-14 15:31 ` Sedat Dilek
2020-10-14 11:01 ` Mark Wielaard
2020-10-14 15:17 ` Andi Kleen
2020-10-17 12:01 ` [PATCH V2] " Mark Wielaard
2020-10-19 19:30 ` Nick Desaulniers
2020-10-20 15:27 ` Masahiro Yamada
2020-10-10 22:33 ` [PATCH] " Mark Wielaard
2020-10-11 11:04 ` Additional debug info to aid cacheline analysis Segher Boessenkool
2020-10-11 12:15 ` Florian Weimer
2020-10-11 12:23 ` Mark Wielaard
2020-10-11 12:28 ` Florian Weimer
2020-10-30 5:26 ` Namhyung Kim
2020-10-30 9:16 ` Mark Wielaard
2020-10-30 10:10 ` Peter Zijlstra
2020-11-02 8:27 ` Masami Hiramatsu
2020-11-03 4:22 ` Namhyung Kim