From: Borislav Petkov <bp@suse.de>
To: Nathan Chancellor <nathan@kernel.org>
Cc: Yin Fengwei <fengwei.yin@intel.com>,
	Carel Si <beibei.si@intel.com>, Joerg Roedel <jroedel@suse.de>,
	LKML <linux-kernel@vger.kernel.org>,
	x86@kernel.org, lkp@lists.01.org, lkp@intel.com,
	bfields@fieldses.org, llvm@lists.linux.dev,
	Nick Desaulniers <ndesaulniers@google.com>
Subject: Re: [LKP] Re: [x86/mm/64] f154f29085: BUG:kernel_reboot-without-warning_in_boot_stage - clang KCOV?
Date: Sat, 18 Dec 2021 12:00:16 +0100
Message-ID: <Yb2/QCOExDEsj47w@zn.tnic>
In-Reply-To: <YbzRHXEMnZjyXzWa@archlinux-ax161>

On Fri, Dec 17, 2021 at 11:04:13AM -0700, Nathan Chancellor wrote:
> This is GCOV, -fprofile-arcs.

Aha, and no_profile_instrument_function disables that.
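
For reference, a minimal sketch of how that attribute is typically spelled on a function. The __no_profile macro below follows the __has_attribute-guarded pattern the kernel uses for compiler attributes; early_setup_step() and early_counter are hypothetical names made up for illustration, not anything from the tree.

```c
/* Fall back gracefully on compilers without __has_attribute. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Expand to the attribute only where the compiler knows it. */
#if __has_attribute(__no_profile_instrument_function__)
# define __no_profile __attribute__((__no_profile_instrument_function__))
#else
# define __no_profile
#endif

/* Hypothetical state touched by early code. */
static unsigned long early_counter;

/*
 * Hypothetical early helper: with the attribute applied, the compiler
 * is asked not to add profiling instrumentation (for example from
 * -fprofile-arcs) to this one function.
 */
__no_profile unsigned long early_setup_step(unsigned long v)
{
	early_counter += v;
	return early_counter;
}
```

The function behaves the same with or without the attribute; only the instrumentation the compiler emits for it changes.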

> I am not really sure how exactly GCOV works under the hood so I cannot
> really comment on it (Nick might); it seems like llvm_gcov_init needs to
> get called for __llvm_gcov_ctr to get set up properly and maybe that
> hasn't happened at the point.

Hmm, right, there is a

        .p2align        4, 0x90                         # -- Begin function __llvm_gcov_init
        .type   __llvm_gcov_init,@function
__llvm_gcov_init:                       # @__llvm_gcov_init

in arch/x86/kernel/cpu/common.s which contains native_write_cr4() which does

	jmp     llvm_gcov_init@PLT

and that function is part of initcalls:

	.section	.init_array.0,"aw",@init_array
	.p2align	3
	.quad	__llvm_gcov_init
	.section	.init_array.1,"aw",@init_array
	.p2align	3
	.quad	asan.module_ctor
	.section	.fini_array.1,"aw",@fini_array
	.p2align	3
	.quad	asan.module_dtor
	.type	.L___asan_gen_.313,@object      # @___asan_gen_.313
	.section	.rodata.str1.1,"aMS",@progbits,1

which, AFAICT, gets called by kernel_init_freeable->do_basic_setup->do_ctors()

which is a lot later than x86_64_start_kernel() so I guess the
__no_profile tag for that function probably makes sense as a fix.
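
To illustrate the ordering concern only, here is a toy userspace model. It assumes, as guessed above and not confirmed in this thread, that instrumented code depends on state which a constructor sets up. All names (fake_gcov_init, ctor_table, run_ctors, instrumented_step) are hypothetical, and run_ctors() is only loosely modeled on a loop that walks a constructor table.

```c
#include <stddef.h>

typedef void (*ctor_fn_t)(void);

/*
 * Stand-in for profiling state that becomes usable only after a
 * constructor has run. Hypothetical; not the real __llvm_gcov_ctr.
 */
static long counter_storage;
static long *counter;

/* Stand-in for a compiler-generated constructor. */
static void fake_gcov_init(void)
{
	counter = &counter_storage;
}

/* Stand-in for the contents of an .init_array style table. */
static ctor_fn_t ctor_table[] = { fake_gcov_init };

/* Walk the table and call every entry, in order. */
static void run_ctors(void)
{
	size_t i;

	for (i = 0; i < sizeof(ctor_table) / sizeof(ctor_table[0]); i++)
		ctor_table[i]();
}

/*
 * Stand-in for an instrumented function. Returns -1 where this toy
 * model would have touched state that is not set up yet, 0 otherwise.
 */
static int instrumented_step(void)
{
	if (!counter)
		return -1;
	(*counter)++;
	return 0;
}
```

Calling instrumented_step() before run_ctors() hits the "not set up yet" path, and calling it afterwards succeeds. That is the shape of the problem if instrumented code runs as early as x86_64_start_kernel(), and why keeping instrumentation out of such functions looks like a reasonable fix.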

> This is a bit of a brain dump, apologies for not offering much upfront
> analysis, I am not as familiar with LLVM internals as Nick but this
> might help others look into the problem.

No, this is still highly appreciated - thanks for taking the time!

> I ended up seeing this thread yesterday through a lore filter that I
> have

Nice filtering. :-)

I did Cc llvm@lists.linux.dev in the hope that you guys would see it.

> and I bisected LLVM based on the fact that it happened with
> clang-13 but not clang-12; that bisect pointed out Nick's commit in LLVM
> that added the no profile attribute, which means that GCOV and KASAN
> need to be enabled to see this bug. I was not able to reproduce it with
> just one of them enabled at a time.

Ah, that's an interesting point:

$ grep -E "(GCOV|KASAN)" /tmp/config-5.16.0-rc3-00003-gf154f290855b | grep -v "^#"
CONFIG_KASAN_SHADOW_OFFSET=0xdffffc0000000000
CONFIG_GCOV_KERNEL=y
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
CONFIG_GCOV_PROFILE_ALL=y
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
CONFIG_CC_HAS_KASAN_GENERIC=y
CONFIG_CC_HAS_KASAN_SW_TAGS=y
CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y
CONFIG_KASAN_INLINE=y
CONFIG_KASAN_VMALLOC=y

> With that, I removed the no profile attribute dependency on GCOV_KERNEL
> and bisected again, landing on the commit in LLVM 13 that enables the
> new pass manager, which fundamentally changes how LLVM transforms its
> IR. Whenever that has happened in the past, it usually points to a
> pre-existing issue; if I go back to clang-11 (the current minimum of
> -next) and enable the NPM there with -fexperimental-new-pass-manager, I
> see this hang so it seems like there might be some issue with how GCOV and
> KASAN are manipulated together in the context of the NPM that was not
> present with the legacy pass manager.

Aha, so it used to work, apparently. Or the NPM is adding additional
code which needs to be initialized because it works ok after the
constructors have run.

> I do see tests in LLVM that test to make sure __llvm_gcov_ctr does not
> get instrumented with KASAN, maybe there is another interaction that
> should not be happening between those two?

Right.

Ok, thanks for the insights, let's see what Nick figures out here.

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Ivo Totev, HRB 36809, AG Nürnberg


