From: Guenter Roeck <linux@roeck-us.net>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
	x86@kernel.org, Linus Torvalds <torvalds@linuxfoundation.org>,
	Uros Bizjak <ubizjak@gmail.com>,
	linux-sparse@vger.kernel.org, lkp@intel.com,
	oe-kbuild-all@lists.linux.dev
Subject: Re: [patch 5/9] x86: Cure per CPU madness on UP
Date: Fri, 15 Mar 2024 09:17:14 -0700
Message-ID: <e20d88d0-5fb9-4307-be67-88b04ae9a188@roeck-us.net>
In-Reply-To: <20240304005104.622511517@linutronix.de>

Hi,

On Mon, Mar 04, 2024 at 11:12:23AM +0100, Thomas Gleixner wrote:
> On UP builds sparse complains rightfully about accesses to cpu_info with
> per CPU accessors:
> 
> cacheinfo.c:282:30: sparse: warning: incorrect type in initializer (different address spaces)
> cacheinfo.c:282:30: sparse:    expected void const [noderef] __percpu *__vpp_verify
> cacheinfo.c:282:30: sparse:    got unsigned int *
> 
> The reason is that on UP builds cpu_info which is a per CPU variable on SMP
> is mapped to boot_cpu_info which is a regular variable. There is a hideous
> accessor cpu_data() which tries to hide this, but it's not sufficient as
> some places require raw accessors and generates worse code than the regular
> per CPU accessors.
> 
> Waste sizeof(struct x86_cpuinfo) memory on UP and provide the per CPU
> cpu_info unconditionally. This requires to update the CPU info on the boot
> CPU as SMP does. (Ab)use the weakly defined smp_prepare_boot_cpu() function
> and implement exactly that.
> 
> This allows to use regular per CPU accessors unconditionally and paves the
> way to remove the cpu_data() hackery.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

This patch results in crashes when running the mainline kernel in qemu
with nosmp builds and Intel CPUs. The problem is _not_ seen on tag
x86-cleanups-2024-03-11; it is only seen in the mainline kernel. I didn't
check all CPU models, but it looks like AMD CPUs are not affected. The
initial bisect points to the merge of x86-cleanups-2024-03-11 into the
mainline kernel. To narrow that down, I rebased x86-cleanups-2024-03-11
on top of the mainline kernel and ran a second bisect on the rebased
branch; it points to this patch. Reverting this patch from the mainline
kernel fixes the problem.

I don't know the code well enough to determine what is wrong.
Please let me know what I can do to help debug the problem.

Thanks,
Guenter

----
crash log:

[    3.291087] BUG: unable to handle page fault for address: ffff9cd801f3f2a0
[    3.291087] #PF: supervisor write access in kernel mode
[    3.291087] #PF: error_code(0x0002) - not-present page
[    3.291087] PGD 60201067 P4D 60201067 PUD 0
[    3.291087] Oops: 0002 [#1] PREEMPT PTI
[    3.291087] CPU: 0 PID: 1 Comm: swapper Not tainted 6.8.0-06619-ge5e038b7ae9d #1
[    3.291087] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[    3.291087] RIP: 0010:rapl_cpu_online+0xf2/0x110
[    3.291087] Code: 05 ff 8e 07 03 40 42 0f 00 48 89 43 60 e8 56 5f 12 00 8b 15 b4 84 61 02 48 8b 05 01 8f 07 03 48 c7 83 90 00 00 00 e0 84 80 b6 <48> 89 9c d0 38 01 00 00 e9 2b ff ff ff b8 f4 ff ff ff e9 47 ff ff
[    3.291087] RSP: 0018:ffffa3d54001fdd0 EFLAGS: 00000246
[    3.291087] RAX: ffff9cd001f3f200 RBX: ffff9cd001fb34a8 RCX: 0000000000000000
[    3.291087] RDX: 00000000ffffffed RSI: 0000000000000001 RDI: ffff9cd001fb3550
[    3.291087] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000001
[    3.291087] R10: 0000000000000001 R11: 0000000000018001 R12: 0000000000000000
[    3.291087] R13: 000000000000009e R14: ffffffffb6808180 R15: ffffffffb86710e5
[    3.291087] FS:  0000000000000000(0000) GS:ffffffffb8ab0000(0000) knlGS:0000000000000000
[    3.291087] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.291087] CR2: ffff9cd801f3f2a0 CR3: 000000005e6a2000 CR4: 00000000001506f0
[    3.291087] Call Trace:
[    3.291087]  <TASK>
[    3.291087]  ? __die+0x1f/0x60
[    3.291087]  ? page_fault_oops+0x148/0x460
[    3.291087]  ? search_exception_tables+0x37/0x50
[    3.291087]  ? fixup_exception+0x21/0x320
[    3.291087]  ? exc_page_fault+0xca/0x150
[    3.291087]  ? asm_exc_page_fault+0x26/0x30
[    3.291087]  ? __pfx_rapl_cpu_online+0x10/0x10
[    3.291087]  ? rapl_cpu_online+0xf2/0x110
[    3.291087]  cpuhp_invoke_callback.constprop.0+0x117/0x3e0
[    3.291087]  __cpuhp_setup_state_cpuslocked+0x1b7/0x280
[    3.291087]  ? __pfx_rapl_cpu_online+0x10/0x10
[    3.291087]  rapl_pmu_init+0x189/0x2e0
[    3.291087]  ? __pfx_rapl_pmu_init+0x10/0x10
[    3.291087]  do_one_initcall+0x4f/0x210
[    3.291087]  kernel_init_freeable+0x166/0x290
[    3.291087]  ? __pfx_kernel_init+0x10/0x10
[    3.291087]  kernel_init+0x15/0x1b0
[    3.291087]  ret_from_fork+0x2f/0x50
[    3.291087]  ? __pfx_kernel_init+0x10/0x10
[    3.291087]  ret_from_fork_asm+0x19/0x30
[    3.291087]  </TASK>
[    3.291087] Modules linked in:
[    3.291087] CR2: ffff9cd801f3f2a0
[    3.291087] ---[ end trace 0000000000000000 ]---
[    3.291087] RIP: 0010:rapl_cpu_online+0xf2/0x110
[    3.291087] Code: 05 ff 8e 07 03 40 42 0f 00 48 89 43 60 e8 56 5f 12 00 8b 15 b4 84 61 02 48 8b 05 01 8f 07 03 48 c7 83 90 00 00 00 e0 84 80 b6 <48> 89 9c d0 38 01 00 00 e9 2b ff ff ff b8 f4 ff ff ff e9 47 ff ff
[    3.291087] RSP: 0018:ffffa3d54001fdd0 EFLAGS: 00000246
[    3.291087] RAX: ffff9cd001f3f200 RBX: ffff9cd001fb34a8 RCX: 0000000000000000
[    3.291087] RDX: 00000000ffffffed RSI: 0000000000000001 RDI: ffff9cd001fb3550
[    3.291087] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000001
[    3.291087] R10: 0000000000000001 R11: 0000000000018001 R12: 0000000000000000
[    3.291087] R13: 000000000000009e R14: ffffffffb6808180 R15: ffffffffb86710e5
[    3.291087] FS:  0000000000000000(0000) GS:ffffffffb8ab0000(0000) knlGS:0000000000000000
[    3.291087] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.291087] CR2: ffff9cd801f3f2a0 CR3: 000000005e6a2000 CR4: 00000000001506f0
[    3.291087] note: swapper[1] exited with irqs disabled
[    3.306047] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
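
Editorial note on the register dump above: the faulting address in CR2 can be
reproduced from the "Code:" bytes and the registers. The sketch below is only
a reading aid, not a diagnosis. It assumes the eight bytes starting at the
<48> marker form a single store instruction, and it decodes just that one
encoding (REX.W, opcode 0x89, SIB byte, 32-bit displacement). The helper
names decode_mov_store and as_s32 are made up for this illustration.

```python
# Values copied from the oops above.
RAX = 0xffff9cd001f3f200
RDX = 0x00000000ffffffed
CR2 = 0xffff9cd801f3f2a0

# Bytes at the faulting RIP: the <48> marker and the seven bytes after it.
INSN = bytes.fromhex("48899cd038010000")


def decode_mov_store(insn: bytes):
    """Decode a 'REX.W 89 /r' store that uses a SIB byte and a disp32.

    Returns (scale, index_reg, base_reg, disp). Hypothetical helper:
    only this single encoding is handled, anything else trips an assert.
    """
    rex, opcode, modrm, sib = insn[0], insn[1], insn[2], insn[3]
    assert rex == 0x48 and opcode == 0x89    # MOV r/m64, r64
    mod, rm = modrm >> 6, modrm & 7
    assert mod == 0b10 and rm == 0b100       # disp32 follows, SIB present
    scale = 1 << (sib >> 6)
    index_reg = (sib >> 3) & 7
    base_reg = sib & 7
    disp = int.from_bytes(insn[4:8], "little", signed=True)
    return scale, index_reg, base_reg, disp


def as_s32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xffffffff
    return value - (1 << 32) if value & 0x80000000 else value


# x86-64 register numbers: 0 is RAX, 2 is RDX.
REGS = {0: RAX, 2: RDX}

scale, index_reg, base_reg, disp = decode_mov_store(INSN)
effective = (REGS[base_reg] + REGS[index_reg] * scale + disp) & (2**64 - 1)

print(f"store to base + index*{scale} + {disp:#x} = {effective:#x}")
print(f"matches CR2: {effective == CR2}")
print(f"index as signed 32-bit: {as_s32(RDX)}")
```

With these inputs the computed address equals CR2, so the store is indexed
by RDX with a scale of 8 on top of RAX. Read as a signed 32-bit value, RDX is
-19; 19 is ENODEV on Linux, so one possible reading is that a negative error
code ended up being used as an array index. Whether that is what actually
happens in rapl_cpu_online() is an assumption the log alone cannot confirm.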

---
1st bisect:

# bad: [1bbeaf83dd7b5e3628b98bec66ff8fe2646e14aa] Merge tag 'perf-tools-for-v6.9-2024-03-13' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
# good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
git bisect start 'HEAD' 'v6.8'
# bad: [9187210eee7d87eea37b45ea93454a88681894a4] Merge tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
git bisect bad 9187210eee7d87eea37b45ea93454a88681894a4
# bad: [a01c9fe32378636ae65bec8047b5de3fdb2ba5c8] Merge tag 'nfsd-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
git bisect bad a01c9fe32378636ae65bec8047b5de3fdb2ba5c8
# bad: [691632f0e86973604678d193ccfa47b9614581aa] Merge tag 's390-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
git bisect bad 691632f0e86973604678d193ccfa47b9614581aa
# good: [8ede842f669b6f78812349bbef4d1efd0fbdafce] Merge tag 'rust-6.9' of https://github.com/Rust-for-Linux/linux
git bisect good 8ede842f669b6f78812349bbef4d1efd0fbdafce
# good: [bfdb395a7cde12d83a623949ed029b0ab38d765b] Merge tag 'x86_mtrr_for_v6.9_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good bfdb395a7cde12d83a623949ed029b0ab38d765b
# bad: [685d98211273f60e38a6d361b62d7016c545297e] Merge tag 'x86-core-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 685d98211273f60e38a6d361b62d7016c545297e
# good: [b0402403e54ae9eb94ce1cbb53c7def776e97426] Merge tag 'edac_updates_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
git bisect good b0402403e54ae9eb94ce1cbb53c7def776e97426
# good: [cb81deefb59de01325ab822f900c13941bfaf67f] x86/idle: Sanitize X86_BUG_AMD_E400 handling
git bisect good cb81deefb59de01325ab822f900c13941bfaf67f
# good: [73f0d1d7b4abb4a46bae1a0d8caf66e23d1138d0] Merge tag 'x86-asm-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 73f0d1d7b4abb4a46bae1a0d8caf66e23d1138d0
# good: [65efc4dc12c5cc296374278673b89390eba79fe6] x86/cpu: Provide a declaration for itlb_multihit_kvm_mitigation
git bisect good 65efc4dc12c5cc296374278673b89390eba79fe6
# good: [d69ad12c786f0a4593c48c0658043aa4a5116b09] Merge tag 'x86-build-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good d69ad12c786f0a4593c48c0658043aa4a5116b09
# good: [35ce64922c8263448e58a2b9e8d15a64e11e9b2d] x86/idle: Select idle routine only once
git bisect good 35ce64922c8263448e58a2b9e8d15a64e11e9b2d
# good: [774a86f1c885460ade4334b901919fa1d8ae6ec6] x86/nmi: Drop unused declaration of proc_nmi_enabled()
git bisect good 774a86f1c885460ade4334b901919fa1d8ae6ec6
# bad: [fcc196579aa1fc167d6778948bff69fae6116737] Merge tag 'x86-cleanups-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad fcc196579aa1fc167d6778948bff69fae6116737
# first bad commit: [fcc196579aa1fc167d6778948bff69fae6116737] Merge tag 'x86-cleanups-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

---
2nd bisect:

# bad: [6d70929c7019e265425f7a89cf72163a639d462e] x86/nmi: Drop unused declaration of proc_nmi_enabled()
# good: [d69ad12c786f0a4593c48c0658043aa4a5116b09] Merge tag 'x86-build-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect start 'HEAD' 'fcc196579aa1fc167d6778948bff69fae6116737~1'
# good: [5c157d25918ef15454c2a9a9262f7b763d9c9add] x86/msr: Add missing __percpu annotations
git bisect good 5c157d25918ef15454c2a9a9262f7b763d9c9add
# bad: [f5a6b1e2d92d825a0f73ca11e960795da336d99e] x86/uaccess: Add missing __force to casts in __access_ok() and valid_user_address()
git bisect bad f5a6b1e2d92d825a0f73ca11e960795da336d99e
# bad: [68907233f6d26a214bb79f892db8d999b15dda97] x86/percpu: Cure per CPU madness on UP
git bisect bad 68907233f6d26a214bb79f892db8d999b15dda97
# good: [053df18ceb928c8631042317a884b2842a457f1b] smp: Consolidate smp_prepare_boot_cpu()
git bisect good 053df18ceb928c8631042317a884b2842a457f1b
# first bad commit: [68907233f6d26a214bb79f892db8d999b15dda97] x86/percpu: Cure per CPU madness on UP
