linux-kernel.vger.kernel.org archive mirror
* "Inconsistent kallsyms data" error
@ 2012-07-05 21:18 Linus Torvalds
  2012-07-06  7:25 ` Jan Beulich
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Linus Torvalds @ 2012-07-05 21:18 UTC (permalink / raw)
  To: Sam Ravnborg, Michal Marek, Arnaud Lacombe, Nick Bowler, Jan Beulich
  Cc: Linux Kernel Mailing List

So for some unknown reason I'm hitting this on just one particular
machine, and it's *very* annoying.

It's annoying for three reasons:

 - it's breaking the build (duh)

 - the error is printed out to stdout, so you don't even *see* it as
an error if you redirect the normal messages somewhere else (like any
sane person, i.e. me, does)

 - when the error happens, it doesn't show *what* went wrong, and in
fact it explicitly cleans up all the files that could show what
happened.
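
To illustrate the second point: a message written with a plain echo goes
to stdout and disappears along with the normal build output, while one
written with 'echo >&2' survives the redirect. A minimal sketch; the two
function names are made up for the illustration and are not part of kbuild:

```shell
#!/bin/sh
# Hypothetical reporters: same message, different output stream.
report_on_stdout() { echo "Inconsistent kallsyms data"; }
report_on_stderr() { echo >&2 "Inconsistent kallsyms data"; }

# Send "normal messages" away, as with 'make >/dev/null'.
report_on_stdout >/dev/null   # message is swallowed by the redirect
report_on_stderr >/dev/null   # message still reaches the terminal
```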

And no, "make KALLSYMS_EXTRA_PASS=1" does not fix anything.

Interestingly, making a trivial change to show the difference
actually made the problem go away. It was entirely reliable with that
particular config and that particular kernel version with a *clean*
tree, but it looks like just changing the tree to be dirty (and thus
changing the version string) hides the problem. Which makes it even
harder to debug, because now I can't see what the difference actually
is that causes things to fail.

VERY annoying.

This is not a new bug - according to google this has been reported
before, back in October 2011. In that case the workaround worked. In
my case it does not.

Anyway, after hacking the source to actually show the difference, and
to also *not* change the version string just because it's dirty, I see
this difference:

 - System.map:

    ...
    ffffffff8189b4d0 R kallsyms_addresses
    ffffffff818ee910 R kallsyms_num_syms
    ffffffff818ee918 R kallsyms_names
    ...
    ffffffff819fa9a0 R __stop___modver
    ffffffff819fb000 R __end_rodata
    ...

 - .tmp_System.map:

    ...
    ffffffff8189b4d0 R kallsyms_addresses
    ffffffff818ee850 R kallsyms_num_syms
    ffffffff818ee858 R kallsyms_names
    ...
    ffffffff819fa720 R __stop___modver
    ffffffff819fb000 R __end_rodata

(the diff itself is huge, because once the addresses change, they stay
different).

Notice how 'kallsyms_addresses' has the same value, but
'kallsyms_num_syms' (and subsequent symbols until the page-aligned
__end_rodata symbol that gets them back in sync) do not. I have no
idea *why* this happens, but it definitely does.

It seems the real difference is the size of the "kallsyms_addresses"
data structure. No idea why, though.
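
For scale, the gap between the two maps can be worked out from the
addresses above. A small sketch, assuming each kallsyms_addresses entry is
an 8-byte address on x86-64 (an assumption, not something stated in this
thread) and using only the low 32 bits of each address so that the
arithmetic stays inside the shell's signed integer range:

```shell
#!/bin/sh
# Low 32 bits of the kallsyms_num_syms address in each map, as quoted above.
num_syms_final=0x818ee910   # System.map
num_syms_tmp=0x818ee850     # .tmp_System.map

# kallsyms_addresses starts at the same address in both maps, so this gap
# is how much larger the table before kallsyms_num_syms is in System.map.
delta=$(( num_syms_final - num_syms_tmp ))
entries=$(( delta / 8 ))    # assumes 8 bytes per entry

echo "delta: $delta bytes, $entries entries of 8 bytes"
# prints: delta: 192 bytes, 24 entries of 8 bytes
```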

This happens with current git (commit c4aed353b1b0), on an x86-64
machine running current F17 as of today, with the attached config.
Maybe that makes somebody else able to recreate this and figure out
what is so magical about the layout that the exact kernel version and
config (and likely compiler/binutils versions) matter.

Any ideas? Added a fairly random set of people who get mentioned in
the linker script commits etc.

                           Linus

[-- Attachment #2: tove-config.gz --]
[-- Type: application/x-gzip, Size: 17487 bytes --]


* Re: "Inconsistent kallsyms data" error
  2012-07-05 21:18 "Inconsistent kallsyms data" error Linus Torvalds
@ 2012-07-06  7:25 ` Jan Beulich
  2012-07-06 11:17 ` Paulo Marques
  2012-07-07 21:40 ` Michal Marek
  2 siblings, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2012-07-06  7:25 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Nick Bowler, Arnaud Lacombe, Sam Ravnborg, Michal Marek,
	Linux Kernel Mailing List

>>> On 05.07.12 at 23:18, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> Anyway, after hacking the source to actually show the difference, and
> to also *not* change the version string just because it's dirty, I see
> this difference:
> 
>  - System.map:
> 
>     ...
>     ffffffff8189b4d0 R kallsyms_addresses
>     ffffffff818ee910 R kallsyms_num_syms
>     ffffffff818ee918 R kallsyms_names
>     ...
>     ffffffff819fa9a0 R __stop___modver
>     ffffffff819fb000 R __end_rodata
>     ...
> 
>  - .tmp_System.map:
> 
>     ...
>     ffffffff8189b4d0 R kallsyms_addresses
>     ffffffff818ee850 R kallsyms_num_syms
>     ffffffff818ee858 R kallsyms_names
>     ...
>     ffffffff819fa720 R __stop___modver
>     ffffffff819fb000 R __end_rodata
> 
> (the diff itself is huge, because once the addresses change, they stay
> different).
> 
> Notice how 'kallsyms_addresses' has the same value, but
> 'kallsyms_num_syms' (and subsequent symbols until the page-aligned
> __end_rodata symbol that gets them back in sync) do not. I have no
> idea *why* this happens, but it definitely does.
> 
> It seems the real difference is the size of the "kallsyms_addresses"
> data structure. No idea why, though.

Since it's clearly not an alignment problem, it almost certainly
means there were symbols added in the second pass, when
none were expected to be added.

> This happens with current git (commit c4aed353b1b0), on an x86-64
> machine running current F17 as of today, with the attached config.
> Maybe that makes somebody else able to recreate this and figure out
> what is so magical about the layout that the exact kernel version and
> config (and likely compiler/binutils versions) matter.
> 
> Any ideas? Added a fairly random set of people who get mentioned in
> the linker script commits etc.

I would have asked you to tar up all files in the build tree root
(if you still have them, ideally including the ones you saved from
deletion; quite possibly other object files in the tree might
subsequently need looking at too, so just taking the full tree
would probably be best), but since I'll be on vacation for two
weeks starting this evening that would probably not be of much
immediate help.

In the unlikely event that the problem remains unsolved by then,
I would still offer to take a look, not least because the mere
presence of the KALLSYMS_EXTRA_PASS workaround has always
puzzled me. (With reproduction unfortunately being so fragile,
I don't have much hope of being able to recreate the issue
myself.)

Jan



* Re: "Inconsistent kallsyms data" error
  2012-07-05 21:18 "Inconsistent kallsyms data" error Linus Torvalds
  2012-07-06  7:25 ` Jan Beulich
@ 2012-07-06 11:17 ` Paulo Marques
  2012-07-07  4:48   ` Paul Gortmaker
  2012-07-07 21:40 ` Michal Marek
  2 siblings, 1 reply; 8+ messages in thread
From: Paulo Marques @ 2012-07-06 11:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sam Ravnborg, Michal Marek, Arnaud Lacombe, Nick Bowler,
	Jan Beulich, Linux Kernel Mailing List

Linus Torvalds wrote:
> [...]
> Notice how 'kallsyms_addresses' has the same value, but
> 'kallsyms_num_syms' (and subsequent symbols until the page-aligned
> __end_rodata symbol that gets them back in sync) do not. I have no
> idea *why* this happens, but it definitely does.
> 
> It seems the real difference is the size of the "kallsyms_addresses"
> data structure. No idea why, though.
> 
> This happens with current git (commit c4aed353b1b0), on an x86-64
> machine running current F17 as of today, with the attached config.
> Maybe that makes somebody else able to recreate this and figure out
> what is so magical about the layout that the exact kernel version and
> config (and likely compiler/binutils versions) matter.
> 
> Any ideas? Added a fairly random set of people who get mentioned in
> the linker script commits etc.

Since kallsyms_addresses seems to change size, this means that there
were symbols added in the second pass.

In the past, this usually happened when some symbols were near a section
boundary and the alignment caused them to be included in or excluded from
the kernel symbol tables.

There was a recent thread from David Brown on the arm linux mailing list
("ARM: two possible fixes for the KALLSYMS build problem"). He tracked
down the problem to having empty per_cpu sections on a non-smp build.
The alignment of these sections made the symbols jump around and change
from one build to the next. This particular problem might be ARM
specific, though.

-- 
Paulo Marques - www.grupopie.com

"I used to be indecisive, but now I'm not so sure."


* Re: "Inconsistent kallsyms data" error
  2012-07-06 11:17 ` Paulo Marques
@ 2012-07-07  4:48   ` Paul Gortmaker
  2012-07-15 21:58     ` Domenico Andreoli
  0 siblings, 1 reply; 8+ messages in thread
From: Paul Gortmaker @ 2012-07-07  4:48 UTC (permalink / raw)
  To: Paulo Marques
  Cc: Linus Torvalds, Sam Ravnborg, Michal Marek, Arnaud Lacombe,
	Nick Bowler, Jan Beulich, Linux Kernel Mailing List

On Fri, Jul 6, 2012 at 7:17 AM, Paulo Marques <pmarques@grupopie.com> wrote:

[...]

>
> There was a recent thread from David Brown on the arm linux mailing list
> ("ARM: two possible fixes for the KALLSYMS build problem"). He tracked
> down the problem to having empty per_cpu sections on a non-smp build.

Actually rmk diagnosed it as the empty per_cpu sections.  See it here:

http://marc.info/?l=linux-next&m=133267456809502&w=2

Paul.
--

> The alignment of these sections made the symbols jump around and change
> from one build to the next. This particular problem might be ARM
> specific, though.
>
> --
> Paulo Marques - www.grupopie.com
>
> "I used to be indecisive, but now I'm not so sure."


* Re: "Inconsistent kallsyms data" error
  2012-07-05 21:18 "Inconsistent kallsyms data" error Linus Torvalds
  2012-07-06  7:25 ` Jan Beulich
  2012-07-06 11:17 ` Paulo Marques
@ 2012-07-07 21:40 ` Michal Marek
  2012-08-10  0:02   ` Jan Engelhardt
  2 siblings, 1 reply; 8+ messages in thread
From: Michal Marek @ 2012-07-07 21:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sam Ravnborg, Arnaud Lacombe, Nick Bowler, Jan Beulich,
	Linux Kernel Mailing List

On Thu, Jul 05, 2012 at 02:18:18PM -0700, Linus Torvalds wrote:
> So for some unknown reason I'm hitting this on just one particular
> machine, and it's *very* annoying.
> 
> It's annoying for three reasons:
> 
>  - it's breaking the build (duh)
> 
>  - the error is printed out to stdout, so you don't even *see* it as
> an error if you redirect the normal messages somewhere else (like any
> sane person, i.e. me, does)

I'm committing the attached patch to kbuild.git#kbuild.


>  - when the error happens, it doesn't show *what* went wrong, and in
> fact it explicitly cleans up all the files that could show what
> happened.

Right, the files need to be preserved somehow. At the same
time, we can't leave the final files there, so that the link is not
marked as successful. I will have a look.
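
One possible shape for such a diagnostic, sketched as a standalone shell
function. This is an illustration only, not the change that went into
link-vmlinux.sh; the function name and the 20-line limit are arbitrary:

```shell
#!/bin/sh
# Hypothetical helper: compare two symbol maps and, on mismatch, print the
# start of the diff to stderr before reporting failure to the caller.
compare_maps() {
	if cmp -s "$1" "$2"; then
		return 0
	fi
	echo >&2 "Inconsistent kallsyms data"
	diff "$1" "$2" | head -n 20 >&2
	return 1
}
```

Called as 'compare_maps System.map .tmp_System.map' before the cleanup
step, something like this would leave the first differing symbols in the
build output even if the files themselves are removed afterwards.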

No idea about the actual kallsyms bug yet, sorry.

Michal


From 5369f55021feb27a1481267e7afefe14128d669f Mon Sep 17 00:00:00 2001
From: Michal Marek <mmarek@suse.cz>
Date: Sat, 7 Jul 2012 23:04:40 +0200
Subject: [PATCH] kbuild: Print errors to stderr

... at least in the top-level Makefile and scripts/link-vmlinux.sh.
There are some more instances of the 'echo <error>; exit 1' pattern in
some arch Makefiles and kconfig.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
---
 Makefile                |   24 ++++++++++++------------
 scripts/link-vmlinux.sh |    4 ++--
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/Makefile b/Makefile
index 0d718ed..d7d7949 100644
--- a/Makefile
+++ b/Makefile
@@ -535,11 +535,11 @@ PHONY += include/config/auto.conf
 
 include/config/auto.conf:
 	$(Q)test -e include/generated/autoconf.h -a -e $@ || (		\
-	echo;								\
-	echo "  ERROR: Kernel configuration is invalid.";		\
-	echo "         include/generated/autoconf.h or $@ are missing.";\
-	echo "         Run 'make oldconfig && make prepare' on kernel src to fix it.";	\
-	echo;								\
+	echo >&2;							\
+	echo >&2 "  ERROR: Kernel configuration is invalid.";		\
+	echo >&2 "         include/generated/autoconf.h or $@ are missing.";\
+	echo >&2 "         Run 'make oldconfig && make prepare' on kernel src to fix it.";	\
+	echo >&2 ;							\
 	/bin/false)
 
 endif # KBUILD_EXTMOD
@@ -796,8 +796,8 @@ prepare3: include/config/kernel.release
 ifneq ($(KBUILD_SRC),)
 	@$(kecho) '  Using $(srctree) as source for kernel'
 	$(Q)if [ -f $(srctree)/.config -o -d $(srctree)/include/config ]; then \
-		echo "  $(srctree) is not clean, please run 'make mrproper'"; \
-		echo "  in the '$(srctree)' directory.";\
+		echo >&2 "  $(srctree) is not clean, please run 'make mrproper'"; \
+		echo >&2 "  in the '$(srctree)' directory.";\
 		/bin/false; \
 	fi;
 endif
@@ -971,11 +971,11 @@ else # CONFIG_MODULES
 # ---------------------------------------------------------------------------
 
 modules modules_install: FORCE
-	@echo
-	@echo "The present kernel configuration has modules disabled."
-	@echo "Type 'make config' and enable loadable module support."
-	@echo "Then build a kernel with module support enabled."
-	@echo
+	@echo >&2
+	@echo >&2 "The present kernel configuration has modules disabled."
+	@echo >&2 "Type 'make config' and enable loadable module support."
+	@echo >&2 "Then build a kernel with module support enabled."
+	@echo >&2
 	@exit 1
 
 endif # CONFIG_MODULES
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index cd9c6c6..4629038 100644
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -210,8 +210,8 @@ if [ -n "${CONFIG_KALLSYMS}" ]; then
 	mksysmap ${kallsyms_vmlinux} .tmp_System.map
 
 	if ! cmp -s System.map .tmp_System.map; then
-		echo Inconsistent kallsyms data
-		echo echo Try "make KALLSYMS_EXTRA_PASS=1" as a workaround
+		echo >&2 Inconsistent kallsyms data
+		echo >&2 echo Try "make KALLSYMS_EXTRA_PASS=1" as a workaround
 		cleanup
 		exit 1
 	fi


* Re: "Inconsistent kallsyms data" error
  2012-07-07  4:48   ` Paul Gortmaker
@ 2012-07-15 21:58     ` Domenico Andreoli
  0 siblings, 0 replies; 8+ messages in thread
From: Domenico Andreoli @ 2012-07-15 21:58 UTC (permalink / raw)
  To: Paul Gortmaker
  Cc: Paulo Marques, Linus Torvalds, Sam Ravnborg, Michal Marek,
	Arnaud Lacombe, Nick Bowler, Jan Beulich,
	Linux Kernel Mailing List

On Sat, Jul 07, 2012 at 12:48:23AM -0400, Paul Gortmaker wrote:
> On Fri, Jul 6, 2012 at 7:17 AM, Paulo Marques <pmarques@grupopie.com> wrote:
> 
> [...]
> 
> >
> > There was a recent thread from David Brown on the arm linux mailing list
> > ("ARM: two possible fixes for the KALLSYMS build problem"). He tracked
> > down the problem to having empty per_cpu sections on a non-smp build.
> 
> Actually rmk diagnosed it as the empty per_cpu sections.  See it here:
> 
> http://marc.info/?l=linux-next&m=133267456809502&w=2

I also saw this problem in the past (a couple of months ago), mostly on a
specific machine. But I _am_ playing with the linker scripts so I expected
that to be the cause in some unknown way.

Unfortunately I never understood the cause (I also switched from reiserfs
to ext4 in the panic; moving the source tree around seemed to help). The
behaviour seemed inconsistent and I could not reliably reproduce it even
though there were very few "moving targets". I was quite clueless.

I don't know why I'm not encountering it any more, but a more useful
diagnostic would surely help. The second pass thing didn't help me either.
I also tried to implement an additional pass but the thing was not making
any sense to me anyway.

So yes, in case of error please leave some trace somewhere so that the poor
user has some chance of working his/her way out of trouble.

cheers,
Domenico


* Re: "Inconsistent kallsyms data" error
  2012-07-07 21:40 ` Michal Marek
@ 2012-08-10  0:02   ` Jan Engelhardt
  2012-08-10  9:59     ` Michal Marek
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Engelhardt @ 2012-08-10  0:02 UTC (permalink / raw)
  To: Michal Marek
  Cc: Linus Torvalds, Sam Ravnborg, Arnaud Lacombe, Nick Bowler,
	Jan Beulich, Linux Kernel Mailing List


On Saturday 2012-07-07 23:40, Michal Marek wrote:
>index cd9c6c6..4629038 100644
>--- a/scripts/link-vmlinux.sh
>+++ b/scripts/link-vmlinux.sh
>@@ -210,8 +210,8 @@ if [ -n "${CONFIG_KALLSYMS}" ]; then
> 	mksysmap ${kallsyms_vmlinux} .tmp_System.map
> 
> 	if ! cmp -s System.map .tmp_System.map; then
>-		echo Inconsistent kallsyms data
>-		echo echo Try "make KALLSYMS_EXTRA_PASS=1" as a workaround
>+		echo >&2 Inconsistent kallsyms data
>+		echo >&2 echo Try "make KALLSYMS_EXTRA_PASS=1" as a workaround

Hm, why is there echo twice in that one line? Seems like an oversight.
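
For reference, the effect of the doubled word can be checked in any POSIX
shell: the second 'echo' is just an argument to the first, and the double
quotes are consumed by the shell. A minimal sketch:

```shell
#!/bin/sh
# Capture what the line from the patch actually emits (stdout used here
# only so that the text can be captured into a variable).
msg=$(echo echo Try "make KALLSYMS_EXTRA_PASS=1" as a workaround)
echo "$msg"
# prints: echo Try make KALLSYMS_EXTRA_PASS=1 as a workaround
```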


* Re: "Inconsistent kallsyms data" error
  2012-08-10  0:02   ` Jan Engelhardt
@ 2012-08-10  9:59     ` Michal Marek
  0 siblings, 0 replies; 8+ messages in thread
From: Michal Marek @ 2012-08-10  9:59 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Linus Torvalds, Sam Ravnborg, Arnaud Lacombe, Nick Bowler,
	Jan Beulich, Linux Kernel Mailing List

On Fri, Aug 10, 2012 at 02:02:33AM +0200, Jan Engelhardt wrote:
> 
> On Saturday 2012-07-07 23:40, Michal Marek wrote:
> >index cd9c6c6..4629038 100644
> >--- a/scripts/link-vmlinux.sh
> >+++ b/scripts/link-vmlinux.sh
> >@@ -210,8 +210,8 @@ if [ -n "${CONFIG_KALLSYMS}" ]; then
> > 	mksysmap ${kallsyms_vmlinux} .tmp_System.map
> > 
> > 	if ! cmp -s System.map .tmp_System.map; then
> >-		echo Inconsistent kallsyms data
> >-		echo echo Try "make KALLSYMS_EXTRA_PASS=1" as a workaround
> >+		echo >&2 Inconsistent kallsyms data
> >+		echo >&2 echo Try "make KALLSYMS_EXTRA_PASS=1" as a workaround
> 
> Hm, why is there echo twice in that one line? Seems like an oversight.

Good catch.

From 367e43c50d7f7c3b0cec17f4d855a96f47f5e17b Mon Sep 17 00:00:00 2001
From: Michal Marek <mmarek@suse.cz>
Date: Fri, 10 Aug 2012 11:55:11 +0200
Subject: [PATCH] link-vmlinux.sh: Fix stray "echo" in error message

Reported-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Michal Marek <mmarek@suse.cz>
---
 scripts/link-vmlinux.sh |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 4629038..4235a63 100644
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -211,7 +211,7 @@ if [ -n "${CONFIG_KALLSYMS}" ]; then
 
 	if ! cmp -s System.map .tmp_System.map; then
 		echo >&2 Inconsistent kallsyms data
-		echo >&2 echo Try "make KALLSYMS_EXTRA_PASS=1" as a workaround
+		echo >&2 Try "make KALLSYMS_EXTRA_PASS=1" as a workaround
 		cleanup
 		exit 1
 	fi

