linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu)
@ 2024-03-04 10:12 Thomas Gleixner
  2024-03-04 10:12 ` [patch 1/9] perf/x86/amd/uncore: Fix __percpu annotation Thomas Gleixner
                   ` (9 more replies)
  0 siblings, 10 replies; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-04 10:12 UTC (permalink / raw)
  To: LKML; +Cc: x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

A recent 0-day report about new __percpu related sparse warnings made me
look deeper into it after I dismissed the report as bogus initially.

It turned out that sparse is actually right and all of these warnings (not
only the most recent ones) are valid and got ignored. Some of them for many
years.

The worst offender is an UP build because that maps the per CPU cpu_info to
boot_cpu_data, which is regular data.

As a consequence all per CPU accessors which look like legit code and are
legit code in the SMP build are causing sparse to emit warnings.

This series addresses this by:

     - Adding the missing __percpu annotations all over the place

     - Curing the UP madness by exposing a proper per CPU cpu_info for the
       price of wasting 320 byte of memory.

       Even if the size police will hate me for that, this cures most of
       the madness in one go and avoids to add more hideous macro mess
       similar to the completely bogus cpu_data() one which should have
       never been there in the first place.

       I know that there are people who think that size matters, but the
       only things which really matter in software are correctness and
       maintainability. The latter simply forbids to add more hideous macro
       mess just to avoid wasting 320 bytes of memory for something which
       is mostly a reminiscence of the good old days...
       
     - Fixing a few obvious non __percpu related warnings which stood out
       prominently.

That reduces the sparse warnings in arch/x86 significantly. The remaining
ones are less trivial to address:

     - The non-x86 specific warning vs. sighand::lock:

       sparse: warning: incorrect type in argument 1 (different address spaces)
       sparse:    expected struct lockdep_map const *lock
       sparse:    got struct lockdep_map [noderef] __rcu *

     - A bunch of lock scope false positives which are non-trivial to solve

     - A gazillion of __iomem warnings with the vast majority in the HPE/UV
       code which are _all_ legit because neither UV nor the other places
       care about the name space annotations at all. Pointer is pointer
       after all.

     - Tons of truncation warnings like this:

       sparse: warning: cast truncates bits from constant value (20002 becomes 2)

       mostly in the hypervisor space (kvm, hyperv). I did not look at
       those at all so I don't know whether they matter or not.

I really think sparse is valuable, but all of us should spend more time on
this to weed out false positives or at least have some filtering of things
which are simply not solvable at the sparse level.

Coming back to __percpu. As I mentioned in the original thread it's a sad
state of affairs that the only way to detect the __percpu fails is sparse
or some other static checker:

      https://lore.kernel.org/all/87bk7vuldh.ffs@tglx

But that's a different problem to solve and does not invalidate the fixes
which come with this series in any way.

If the compiler people would have provided a way to utilize the anyway
non-standard name space support in a useful way, I could have spared the
time to bang my head agaist the wall simply because this would have failed
to build in the first place long ago. That just makes me sad.

After wading through this, I really ask the 0-day people to push hard on
any sparse fallout which involves __percpu. The resulting failures can be
truly subtle and not necessarily fatal right away.

The series is based on Linus tree and also available from git:

   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/core

Thanks,

	tglx
---
 arch/alpha/kernel/smp.c              |    5 -----
 arch/arc/kernel/smp.c                |    5 -----
 arch/csky/kernel/smp.c               |    4 ----
 arch/hexagon/kernel/smp.c            |    4 ----
 arch/openrisc/kernel/smp.c           |    4 ----
 arch/riscv/kernel/smpboot.c          |    4 ----
 arch/sparc/kernel/smp_64.c           |    4 ----
 arch/x86/events/amd/uncore.c         |    2 +-
 arch/x86/events/intel/core.c         |    1 +
 arch/x86/events/intel/ds.c           |    1 +
 arch/x86/include/asm/debugreg.h      |   24 ++++++++++++++++++++++++
 arch/x86/include/asm/fsgsbase.h      |    2 +-
 arch/x86/include/asm/msr.h           |   26 ++++++++++++++------------
 arch/x86/include/asm/processor.h     |   28 ----------------------------
 arch/x86/include/asm/smp.h           |    5 -----
 arch/x86/include/asm/spec-ctrl.h     |    2 ++
 arch/x86/include/asm/special_insns.h |    4 ++--
 arch/x86/include/asm/tsc.h           |    3 ++-
 arch/x86/include/asm/uaccess_64.h    |    7 ++++---
 arch/x86/kernel/callthunks.c         |    4 ++--
 arch/x86/kernel/cpu/bugs.c           |    2 +-
 arch/x86/kernel/cpu/common.c         |    3 +++
 arch/x86/kernel/cpu/intel_pconfig.c  |    2 ++
 arch/x86/kernel/cpu/rdrand.c         |    1 +
 arch/x86/kernel/fpu/bugs.c           |    2 ++
 arch/x86/kernel/setup.c              |   10 ++++++++++
 arch/x86/kernel/smpboot.c            |    9 +++++----
 arch/x86/kernel/step.c               |    2 ++
 arch/x86/kvm/mmu/mmu.c               |    3 +--
 arch/x86/lib/msr-smp.c               |   12 +++++-------
 arch/x86/lib/msr.c                   |    6 +++---
 include/linux/smp.h                  |   13 ++++++-------
 init/main.c                          |    4 ++++
 33 files changed, 99 insertions(+), 109 deletions(-)

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 1/9] perf/x86/amd/uncore: Fix __percpu annotation
  2024-03-04 10:12 [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu) Thomas Gleixner
@ 2024-03-04 10:12 ` Thomas Gleixner
  2024-03-04 12:15   ` [tip: x86/cleanups] " tip-bot2 for Thomas Gleixner
  2024-03-04 10:12 ` [patch 2/9] x86/msr: Prepare for including percpu.h Thomas Gleixner
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-04 10:12 UTC (permalink / raw)
  To: LKML; +Cc: x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

The __percpu annotation in struct amd_uncore is confusing sparse:

uncore.c:649:10: sparse: warning: incorrect type in initializer (different address spaces)
uncore.c:649:10: sparse:    expected void const [noderef] __percpu *__vpp_verify
uncore.c:649:10: sparse:    got union amd_uncore_info *

The reason is that the __percpu annotation sits between the '*'
dereferencing operator and the member name.

Move it before the dereferencing operator to cure this.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/events/amd/uncore.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -71,7 +71,7 @@ union amd_uncore_info {
 };
 
 struct amd_uncore {
-	union amd_uncore_info * __percpu info;
+	union amd_uncore_info  __percpu *info;
 	struct amd_uncore_pmu *pmus;
 	unsigned int num_pmus;
 	bool init_done;


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 2/9] x86/msr: Prepare for including percpu.h
  2024-03-04 10:12 [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu) Thomas Gleixner
  2024-03-04 10:12 ` [patch 1/9] perf/x86/amd/uncore: Fix __percpu annotation Thomas Gleixner
@ 2024-03-04 10:12 ` Thomas Gleixner
  2024-03-04 12:15   ` [tip: x86/cleanups] x86/msr: Prepare for including <linux/percpu.h> into <asm/msr.h> tip-bot2 for Thomas Gleixner
  2024-03-04 10:12 ` [patch 3/9] x86/msr: Add missing __percpu annotations Thomas Gleixner
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-04 10:12 UTC (permalink / raw)
  To: LKML; +Cc: x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

To cleanup the per CPU insanity of UP which causes sparse to be rightfully
unhappy and prevents the usage of the generic per cpu accessors on cpu_info
it is necessary to include linux/percpu.h into asm/msr.h.

Including percpu.h into msr.h is impossible because it ends up in header
dependency hell. The problem is that processor.h includes msr.h. The
inclusion of percpu.h results in a compile fail where the compiler cannot
longer handle an include in cpufeature.h which references boot_cpu_data
which is defined in processor.h

The only reason why msr.h is included in processor.h are the
set/get_debugctlmsr() inlines. They are defined there because processor.h
is such a nice dump ground for everything. In fact they belong obviously
into debugreg.h.

Move them to debugreg.h and fixup the resulting damage which is just
exposing the reliance on random include chains.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/events/intel/core.c         |    1 +
 arch/x86/events/intel/ds.c           |    1 +
 arch/x86/include/asm/debugreg.h      |   24 ++++++++++++++++++++++++
 arch/x86/include/asm/fsgsbase.h      |    2 +-
 arch/x86/include/asm/processor.h     |   22 ----------------------
 arch/x86/include/asm/special_insns.h |    4 ++--
 arch/x86/kernel/cpu/intel_pconfig.c  |    2 ++
 arch/x86/kernel/cpu/rdrand.c         |    1 +
 arch/x86/kernel/fpu/bugs.c           |    2 ++
 arch/x86/kernel/step.c               |    2 ++
 10 files changed, 36 insertions(+), 25 deletions(-)

--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -17,6 +17,7 @@
 #include <linux/kvm_host.h>
 
 #include <asm/cpufeature.h>
+#include <asm/debugreg.h>
 #include <asm/hardirq.h>
 #include <asm/intel-family.h>
 #include <asm/intel_pt.h>
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -5,6 +5,7 @@
 #include <linux/sched/clock.h>
 
 #include <asm/cpu_entry_area.h>
+#include <asm/debugreg.h>
 #include <asm/perf_event.h>
 #include <asm/tlbflush.h>
 #include <asm/insn.h>
--- a/arch/x86/include/asm/debugreg.h
+++ b/arch/x86/include/asm/debugreg.h
@@ -5,7 +5,9 @@
 #include <linux/bug.h>
 #include <linux/percpu.h>
 #include <uapi/asm/debugreg.h>
+
 #include <asm/cpufeature.h>
+#include <asm/msr.h>
 
 DECLARE_PER_CPU(unsigned long, cpu_dr7);
 
@@ -159,4 +161,26 @@ static inline unsigned long amd_get_dr_a
 }
 #endif
 
+static inline unsigned long get_debugctlmsr(void)
+{
+	unsigned long debugctlmsr = 0;
+
+#ifndef CONFIG_X86_DEBUGCTLMSR
+	if (boot_cpu_data.x86 < 6)
+		return 0;
+#endif
+	rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctlmsr);
+
+	return debugctlmsr;
+}
+
+static inline void update_debugctlmsr(unsigned long debugctlmsr)
+{
+#ifndef CONFIG_X86_DEBUGCTLMSR
+	if (boot_cpu_data.x86 < 6)
+		return;
+#endif
+	wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctlmsr);
+}
+
 #endif /* _ASM_X86_DEBUGREG_H */
--- a/arch/x86/include/asm/fsgsbase.h
+++ b/arch/x86/include/asm/fsgsbase.h
@@ -6,7 +6,7 @@
 
 #ifdef CONFIG_X86_64
 
-#include <asm/msr-index.h>
+#include <asm/msr.h>
 
 /*
  * Read/write a task's FSBASE or GSBASE. This returns the value that
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -578,28 +578,6 @@ extern void cpu_init(void);
 extern void cpu_init_exception_handling(void);
 extern void cr4_init(void);
 
-static inline unsigned long get_debugctlmsr(void)
-{
-	unsigned long debugctlmsr = 0;
-
-#ifndef CONFIG_X86_DEBUGCTLMSR
-	if (boot_cpu_data.x86 < 6)
-		return 0;
-#endif
-	rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctlmsr);
-
-	return debugctlmsr;
-}
-
-static inline void update_debugctlmsr(unsigned long debugctlmsr)
-{
-#ifndef CONFIG_X86_DEBUGCTLMSR
-	if (boot_cpu_data.x86 < 6)
-		return;
-#endif
-	wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctlmsr);
-}
-
 extern void set_task_blockstep(struct task_struct *task, bool on);
 
 /* Boot loader type from the setup header: */
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -2,11 +2,11 @@
 #ifndef _ASM_X86_SPECIAL_INSNS_H
 #define _ASM_X86_SPECIAL_INSNS_H
 
-
 #ifdef __KERNEL__
-
 #include <asm/nops.h>
 #include <asm/processor-flags.h>
+
+#include <linux/errno.h>
 #include <linux/irqflags.h>
 #include <linux/jump_label.h>
 
--- a/arch/x86/kernel/cpu/intel_pconfig.c
+++ b/arch/x86/kernel/cpu/intel_pconfig.c
@@ -7,6 +7,8 @@
  * Author:
  *	Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
  */
+#include <linux/bug.h>
+#include <linux/limits.h>
 
 #include <asm/cpufeature.h>
 #include <asm/intel_pconfig.h>
--- a/arch/x86/kernel/cpu/rdrand.c
+++ b/arch/x86/kernel/cpu/rdrand.c
@@ -6,6 +6,7 @@
  * Authors: Fenghua Yu <fenghua.yu@intel.com>,
  *          H. Peter Anvin <hpa@linux.intel.com>
  */
+#include <linux/printk.h>
 
 #include <asm/processor.h>
 #include <asm/archrandom.h>
--- a/arch/x86/kernel/fpu/bugs.c
+++ b/arch/x86/kernel/fpu/bugs.c
@@ -2,6 +2,8 @@
 /*
  * x86 FPU bug checks:
  */
+#include <linux/printk.h>
+
 #include <asm/cpufeature.h>
 #include <asm/fpu/api.h>
 
--- a/arch/x86/kernel/step.c
+++ b/arch/x86/kernel/step.c
@@ -6,7 +6,9 @@
 #include <linux/sched/task_stack.h>
 #include <linux/mm.h>
 #include <linux/ptrace.h>
+
 #include <asm/desc.h>
+#include <asm/debugreg.h>
 #include <asm/mmu_context.h>
 
 unsigned long convert_ip_to_linear(struct task_struct *child, struct pt_regs *regs)


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 3/9] x86/msr: Add missing __percpu annotations
  2024-03-04 10:12 [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu) Thomas Gleixner
  2024-03-04 10:12 ` [patch 1/9] perf/x86/amd/uncore: Fix __percpu annotation Thomas Gleixner
  2024-03-04 10:12 ` [patch 2/9] x86/msr: Prepare for including percpu.h Thomas Gleixner
@ 2024-03-04 10:12 ` Thomas Gleixner
  2024-03-04 12:15   ` [tip: x86/cleanups] " tip-bot2 for Thomas Gleixner
  2024-03-04 10:12 ` [patch 4/9] smp: Consolidate smp_prepare_boot_cpu() Thomas Gleixner
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-04 10:12 UTC (permalink / raw)
  To: LKML; +Cc: x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

sparse complains rightfully about using a plain pointer for per CPU
accessors:

msr-smp.c:15:23: sparse: warning: incorrect type in initializer (different address spaces)
msr-smp.c:15:23: sparse:    expected void const [noderef] __percpu *__vpp_verify
msr-smp.c:15:23: sparse:    got struct msr *

Add __percpu annotations to the related datastructure and function
arguments to cure this. This also cures the related sparse warnings at the
callsites in drivers/edac/amd64_edac.c.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/msr.h       |   26 ++++++++++++++------------
 arch/x86/include/asm/processor.h |    1 -
 arch/x86/include/asm/tsc.h       |    3 ++-
 arch/x86/lib/msr-smp.c           |   12 +++++-------
 arch/x86/lib/msr.c               |    6 +++---
 5 files changed, 24 insertions(+), 24 deletions(-)

--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -12,11 +12,13 @@
 #include <uapi/asm/msr.h>
 #include <asm/shared/msr.h>
 
+#include <linux/percpu.h>
+
 struct msr_info {
-	u32 msr_no;
-	struct msr reg;
-	struct msr *msrs;
-	int err;
+	u32			msr_no;
+	struct msr		reg;
+	struct msr __percpu	*msrs;
+	int			err;
 };
 
 struct msr_regs_info {
@@ -323,8 +325,8 @@ static inline int wrmsrl_safe(u32 msr, u
 	return wrmsr_safe(msr, (u32)val,  (u32)(val >> 32));
 }
 
-struct msr *msrs_alloc(void);
-void msrs_free(struct msr *msrs);
+struct msr __percpu *msrs_alloc(void);
+void msrs_free(struct msr __percpu *msrs);
 int msr_set_bit(u32 msr, u8 bit);
 int msr_clear_bit(u32 msr, u8 bit);
 
@@ -333,8 +335,8 @@ int rdmsr_on_cpu(unsigned int cpu, u32 m
 int wrmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h);
 int rdmsrl_on_cpu(unsigned int cpu, u32 msr_no, u64 *q);
 int wrmsrl_on_cpu(unsigned int cpu, u32 msr_no, u64 q);
-void rdmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr *msrs);
-void wrmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr *msrs);
+void rdmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr __percpu *msrs);
+void wrmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr __percpu *msrs);
 int rdmsr_safe_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h);
 int wrmsr_safe_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h);
 int rdmsrl_safe_on_cpu(unsigned int cpu, u32 msr_no, u64 *q);
@@ -363,14 +365,14 @@ static inline int wrmsrl_on_cpu(unsigned
 	return 0;
 }
 static inline void rdmsr_on_cpus(const struct cpumask *m, u32 msr_no,
-				struct msr *msrs)
+				struct msr __percpu *msrs)
 {
-	rdmsr_on_cpu(0, msr_no, &(msrs[0].l), &(msrs[0].h));
+	rdmsr_on_cpu(0, msr_no, raw_cpu_ptr(&msrs->l), raw_cpu_ptr(&msrs->h));
 }
 static inline void wrmsr_on_cpus(const struct cpumask *m, u32 msr_no,
-				struct msr *msrs)
+				struct msr __percpu *msrs)
 {
-	wrmsr_on_cpu(0, msr_no, msrs[0].l, msrs[0].h);
+	wrmsr_on_cpu(0, msr_no, raw_cpu_read(msrs->l), raw_cpu_read(msrs->h));
 }
 static inline int rdmsr_safe_on_cpu(unsigned int cpu, u32 msr_no,
 				    u32 *l, u32 *h)
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -20,7 +20,6 @@ struct vm86;
 #include <asm/page.h>
 #include <asm/pgtable_types.h>
 #include <asm/percpu.h>
-#include <asm/msr.h>
 #include <asm/desc_defs.h>
 #include <asm/nops.h>
 #include <asm/special_insns.h>
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -5,8 +5,9 @@
 #ifndef _ASM_X86_TSC_H
 #define _ASM_X86_TSC_H
 
-#include <asm/processor.h>
 #include <asm/cpufeature.h>
+#include <asm/processor.h>
+#include <asm/msr.h>
 
 /*
  * Standard way to access the cycle counter.
--- a/arch/x86/lib/msr-smp.c
+++ b/arch/x86/lib/msr-smp.c
@@ -9,10 +9,9 @@ static void __rdmsr_on_cpu(void *info)
 {
 	struct msr_info *rv = info;
 	struct msr *reg;
-	int this_cpu = raw_smp_processor_id();
 
 	if (rv->msrs)
-		reg = per_cpu_ptr(rv->msrs, this_cpu);
+		reg = this_cpu_ptr(rv->msrs);
 	else
 		reg = &rv->reg;
 
@@ -23,10 +22,9 @@ static void __wrmsr_on_cpu(void *info)
 {
 	struct msr_info *rv = info;
 	struct msr *reg;
-	int this_cpu = raw_smp_processor_id();
 
 	if (rv->msrs)
-		reg = per_cpu_ptr(rv->msrs, this_cpu);
+		reg = this_cpu_ptr(rv->msrs);
 	else
 		reg = &rv->reg;
 
@@ -97,7 +95,7 @@ int wrmsrl_on_cpu(unsigned int cpu, u32
 EXPORT_SYMBOL(wrmsrl_on_cpu);
 
 static void __rwmsr_on_cpus(const struct cpumask *mask, u32 msr_no,
-			    struct msr *msrs,
+			    struct msr __percpu *msrs,
 			    void (*msr_func) (void *info))
 {
 	struct msr_info rv;
@@ -124,7 +122,7 @@ static void __rwmsr_on_cpus(const struct
  * @msrs:       array of MSR values
  *
  */
-void rdmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr *msrs)
+void rdmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr __percpu *msrs)
 {
 	__rwmsr_on_cpus(mask, msr_no, msrs, __rdmsr_on_cpu);
 }
@@ -138,7 +136,7 @@ EXPORT_SYMBOL(rdmsr_on_cpus);
  * @msrs:       array of MSR values
  *
  */
-void wrmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr *msrs)
+void wrmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr __percpu *msrs)
 {
 	__rwmsr_on_cpus(mask, msr_no, msrs, __wrmsr_on_cpu);
 }
--- a/arch/x86/lib/msr.c
+++ b/arch/x86/lib/msr.c
@@ -6,9 +6,9 @@
 #define CREATE_TRACE_POINTS
 #include <asm/msr-trace.h>
 
-struct msr *msrs_alloc(void)
+struct msr __percpu *msrs_alloc(void)
 {
-	struct msr *msrs = NULL;
+	struct msr __percpu *msrs = NULL;
 
 	msrs = alloc_percpu(struct msr);
 	if (!msrs) {
@@ -20,7 +20,7 @@ struct msr *msrs_alloc(void)
 }
 EXPORT_SYMBOL(msrs_alloc);
 
-void msrs_free(struct msr *msrs)
+void msrs_free(struct msr __percpu *msrs)
 {
 	free_percpu(msrs);
 }


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 4/9] smp: Consolidate smp_prepare_boot_cpu()
  2024-03-04 10:12 [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu) Thomas Gleixner
                   ` (2 preceding siblings ...)
  2024-03-04 10:12 ` [patch 3/9] x86/msr: Add missing __percpu annotations Thomas Gleixner
@ 2024-03-04 10:12 ` Thomas Gleixner
  2024-03-04 12:15   ` [tip: x86/cleanups] " tip-bot2 for Thomas Gleixner
  2024-03-04 10:12 ` [patch 5/9] x86: Cure per CPU madness on UP Thomas Gleixner
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-04 10:12 UTC (permalink / raw)
  To: LKML; +Cc: x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

There is no point in having seven architectures implementing the same empty
stub.

Provide a weak function in the init code and remove the stubs.

This also allows to utilize the function on UP which is required to
sanitize the percpu handling on X86 UP.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/alpha/kernel/smp.c     |    5 -----
 arch/arc/kernel/smp.c       |    5 -----
 arch/csky/kernel/smp.c      |    4 ----
 arch/hexagon/kernel/smp.c   |    4 ----
 arch/openrisc/kernel/smp.c  |    4 ----
 arch/riscv/kernel/smpboot.c |    4 ----
 arch/sparc/kernel/smp_64.c  |    4 ----
 arch/x86/include/asm/smp.h  |    5 -----
 arch/x86/kernel/smpboot.c   |    5 +++++
 include/linux/smp.h         |   13 ++++++-------
 init/main.c                 |    4 ++++
 11 files changed, 15 insertions(+), 42 deletions(-)

--- a/arch/alpha/kernel/smp.c
+++ b/arch/alpha/kernel/smp.c
@@ -467,11 +467,6 @@ smp_prepare_cpus(unsigned int max_cpus)
 	smp_num_cpus = smp_num_probed;
 }
 
-void
-smp_prepare_boot_cpu(void)
-{
-}
-
 int
 __cpu_up(unsigned int cpu, struct task_struct *tidle)
 {
--- a/arch/arc/kernel/smp.c
+++ b/arch/arc/kernel/smp.c
@@ -39,11 +39,6 @@ struct plat_smp_ops  __weak plat_smp_ops
 /* XXX: per cpu ? Only needed once in early secondary boot */
 struct task_struct *secondary_idle_tsk;
 
-/* Called from start_kernel */
-void __init smp_prepare_boot_cpu(void)
-{
-}
-
 static int __init arc_get_cpu_map(const char *name, struct cpumask *cpumask)
 {
 	unsigned long dt_root = of_get_flat_dt_root();
--- a/arch/csky/kernel/smp.c
+++ b/arch/csky/kernel/smp.c
@@ -152,10 +152,6 @@ void arch_irq_work_raise(void)
 }
 #endif
 
-void __init smp_prepare_boot_cpu(void)
-{
-}
-
 void __init smp_prepare_cpus(unsigned int max_cpus)
 {
 }
--- a/arch/hexagon/kernel/smp.c
+++ b/arch/hexagon/kernel/smp.c
@@ -114,10 +114,6 @@ void send_ipi(const struct cpumask *cpum
 	local_irq_restore(flags);
 }
 
-void __init smp_prepare_boot_cpu(void)
-{
-}
-
 /*
  * interrupts should already be disabled from the VM
  * SP should already be correct; need to set THREADINFO_REG
--- a/arch/openrisc/kernel/smp.c
+++ b/arch/openrisc/kernel/smp.c
@@ -57,10 +57,6 @@ static void boot_secondary(unsigned int
 	spin_unlock(&boot_lock);
 }
 
-void __init smp_prepare_boot_cpu(void)
-{
-}
-
 void __init smp_init_cpus(void)
 {
 	struct device_node *cpu;
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -42,10 +42,6 @@
 
 static DECLARE_COMPLETION(cpu_running);
 
-void __init smp_prepare_boot_cpu(void)
-{
-}
-
 void __init smp_prepare_cpus(unsigned int max_cpus)
 {
 	int cpuid;
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -1206,10 +1206,6 @@ void __init smp_prepare_cpus(unsigned in
 {
 }
 
-void smp_prepare_boot_cpu(void)
-{
-}
-
 void __init smp_setup_processor_id(void)
 {
 	if (tlb_type == spitfire)
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -59,11 +59,6 @@ static inline void stop_other_cpus(void)
 	smp_ops.stop_other_cpus(1);
 }
 
-static inline void smp_prepare_boot_cpu(void)
-{
-	smp_ops.smp_prepare_boot_cpu();
-}
-
 static inline void smp_prepare_cpus(unsigned int max_cpus)
 {
 	smp_ops.smp_prepare_cpus(max_cpus);
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1187,6 +1187,11 @@ void __init smp_prepare_cpus_common(void
 	set_cpu_sibling_map(0);
 }
 
+void __init smp_prepare_boot_cpu(void)
+{
+	smp_ops.smp_prepare_boot_cpu();
+}
+
 #ifdef CONFIG_X86_64
 /* Establish whether parallel bringup can be supported. */
 bool __init arch_cpuhp_init_parallel_bringup(void)
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -105,6 +105,12 @@ static inline void on_each_cpu_cond(smp_
 	on_each_cpu_cond_mask(cond_func, func, info, wait, cpu_online_mask);
 }
 
+/*
+ * Architecture specific boot CPU setup.  Defined as empty weak function in
+ * init/main.c. Architectures can override it.
+ */
+void smp_prepare_boot_cpu(void);
+
 #ifdef CONFIG_SMP
 
 #include <linux/preempt.h>
@@ -171,12 +177,6 @@ void generic_smp_call_function_single_in
 #define generic_smp_call_function_interrupt \
 	generic_smp_call_function_single_interrupt
 
-/*
- * Mark the boot cpu "online" so that it can call console drivers in
- * printk() and can access its per-cpu storage.
- */
-void smp_prepare_boot_cpu(void);
-
 extern unsigned int setup_max_cpus;
 extern void __init setup_nr_cpu_ids(void);
 extern void __init smp_init(void);
@@ -203,7 +203,6 @@ static inline void up_smp_call_function(
 			(up_smp_call_function(func, info))
 
 static inline void smp_send_reschedule(int cpu) { }
-#define smp_prepare_boot_cpu()			do {} while (0)
 #define smp_call_function_many(mask, func, info, wait) \
 			(up_smp_call_function(func, info))
 static inline void call_function_init(void) { }
--- a/init/main.c
+++ b/init/main.c
@@ -776,6 +776,10 @@ void __init __weak smp_setup_processor_i
 {
 }
 
+void __init __weak smp_prepare_boot_cpu(void)
+{
+}
+
 # if THREAD_SIZE >= PAGE_SIZE
 void __init __weak thread_stack_cache_init(void)
 {


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-04 10:12 [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu) Thomas Gleixner
                   ` (3 preceding siblings ...)
  2024-03-04 10:12 ` [patch 4/9] smp: Consolidate smp_prepare_boot_cpu() Thomas Gleixner
@ 2024-03-04 10:12 ` Thomas Gleixner
  2024-03-04 12:15   ` [tip: x86/cleanups] x86/percpu: " tip-bot2 for Thomas Gleixner
  2024-03-15 16:17   ` [patch 5/9] x86: " Guenter Roeck
  2024-03-04 10:12 ` [patch 6/9] x86/uaccess: Add missing __force to casts in __access_ok() and valid_user_address() Thomas Gleixner
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-04 10:12 UTC (permalink / raw)
  To: LKML; +Cc: x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

On UP builds sparse complains rightfully about accesses to cpu_info with
per CPU accessors:

cacheinfo.c:282:30: sparse: warning: incorrect type in initializer (different address spaces)
cacheinfo.c:282:30: sparse:    expected void const [noderef] __percpu *__vpp_verify
cacheinfo.c:282:30: sparse:    got unsigned int *

The reason is that on UP builds cpu_info which is a per CPU variable on SMP
is mapped to boot_cpu_info which is a regular variable. There is a hideous
accessor cpu_data() which tries to hide this, but it's not sufficient as
some places require raw accessors and generates worse code than the regular
per CPU accessors.

Waste sizeof(struct x86_cpuinfo) memory on UP and provide the per CPU
cpu_info unconditionally. This requires to update the CPU info on the boot
CPU as SMP does. (Ab)use the weakly defined smp_prepare_boot_cpu() function
and implement exactly that.

This allows to use regular per CPU accessors uncoditionally and paves the
way to remove the cpu_data() hackery.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/processor.h |    5 -----
 arch/x86/kernel/cpu/common.c     |    3 +++
 arch/x86/kernel/setup.c          |   10 ++++++++++
 arch/x86/kernel/smpboot.c        |    4 ----
 4 files changed, 13 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -185,13 +185,8 @@ extern struct cpuinfo_x86	new_cpu_data;
 extern __u32			cpu_caps_cleared[NCAPINTS + NBUGINTS];
 extern __u32			cpu_caps_set[NCAPINTS + NBUGINTS];
 
-#ifdef CONFIG_SMP
 DECLARE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
 #define cpu_data(cpu)		per_cpu(cpu_info, cpu)
-#else
-#define cpu_info		boot_cpu_data
-#define cpu_data(cpu)		boot_cpu_data
-#endif
 
 extern const struct seq_operations cpuinfo_op;
 
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -70,6 +70,9 @@
 
 #include "cpu.h"
 
+DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
+EXPORT_PER_CPU_SYMBOL(cpu_info);
+
 u32 elf_hwcap2 __read_mostly;
 
 /* Number of siblings per CPU package */
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1211,6 +1211,16 @@ void __init i386_reserve_resources(void)
 
 #endif /* CONFIG_X86_32 */
 
+#ifndef CONFIG_SMP
+void __init smp_prepare_boot_cpu(void)
+{
+	struct cpuinfo_x86 *c = &cpu_data(0);
+
+	*c = boot_cpu_data;
+	c->initialized = true;
+}
+#endif
+
 static struct notifier_block kernel_offset_notifier = {
 	.notifier_call = dump_kernel_offset
 };
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -101,10 +101,6 @@ EXPORT_PER_CPU_SYMBOL(cpu_core_map);
 DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_die_map);
 EXPORT_PER_CPU_SYMBOL(cpu_die_map);
 
-/* Per CPU bogomips and other parameters */
-DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
-EXPORT_PER_CPU_SYMBOL(cpu_info);
-
 /* CPUs which are the primary SMT threads */
 struct cpumask __cpu_primary_thread_mask __read_mostly;
 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 6/9] x86/uaccess: Add missing __force to casts in __access_ok() and valid_user_address()
  2024-03-04 10:12 [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu) Thomas Gleixner
                   ` (4 preceding siblings ...)
  2024-03-04 10:12 ` [patch 5/9] x86: Cure per CPU madness on UP Thomas Gleixner
@ 2024-03-04 10:12 ` Thomas Gleixner
  2024-03-04 12:15   ` [tip: x86/cleanups] " tip-bot2 for Thomas Gleixner
  2024-03-04 10:12 ` [patch 7/9] x86/cpu: Use EXPORT_PER_CPU_SYMBOL_GPL() for x86_spec_ctrl_current Thomas Gleixner
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-04 10:12 UTC (permalink / raw)
  To: LKML; +Cc: x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

sparse complains about losing the __user address space due to the cast to
long:

uaccess_64.h:88:24: sparse: warning: cast removes address space '__user' of expression

Annotate it with __force to tell sparse that this is intentional.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/uaccess_64.h |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -54,7 +54,7 @@ static inline unsigned long __untagged_a
  * half and a user half.  When cast to a signed type, user pointers
  * are positive and kernel pointers are negative.
  */
-#define valid_user_address(x) ((long)(x) >= 0)
+#define valid_user_address(x) ((__force long)(x) >= 0)
 
 /*
  * User pointers can have tag bits on x86-64.  This scheme tolerates
@@ -87,8 +87,9 @@ static inline bool __access_ok(const voi
 	if (__builtin_constant_p(size <= PAGE_SIZE) && size <= PAGE_SIZE) {
 		return valid_user_address(ptr);
 	} else {
-		unsigned long sum = size + (unsigned long)ptr;
-		return valid_user_address(sum) && sum >= (unsigned long)ptr;
+		unsigned long sum = size + (__force unsigned long)ptr;
+
+		return valid_user_address(sum) && sum >= (__force unsigned long)ptr;
 	}
 }
 #define __access_ok __access_ok


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 7/9] x86/cpu: Use EXPORT_PER_CPU_SYMBOL_GPL() for x86_spec_ctrl_current
  2024-03-04 10:12 [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu) Thomas Gleixner
                   ` (5 preceding siblings ...)
  2024-03-04 10:12 ` [patch 6/9] x86/uaccess: Add missing __force to casts in __access_ok() and valid_user_address() Thomas Gleixner
@ 2024-03-04 10:12 ` Thomas Gleixner
  2024-03-04 12:15   ` [tip: x86/cleanups] " tip-bot2 for Thomas Gleixner
  2024-03-04 10:12 ` [patch 8/9] x86/cpu: Provide a declaration for itlb_multihit_kvm_mitigation Thomas Gleixner
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-04 10:12 UTC (permalink / raw)
  To: LKML; +Cc: x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

sparse rightfully complains:

bugs.c:71:9: sparse: warning: incorrect type in initializer (different address spaces)
bugs.c:71:9: sparse:    expected void const [noderef] __percpu *__vpp_verify
bugs.c:71:9: sparse:    got unsigned long long *

The reason is that x86_spec_ctrl_current which is a per CPU variable is
exported with EXPORT_SYMBOL_GPL().

Use EXPORT_PER_CPU_SYMBOL_GPL() instead.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/bugs.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -56,7 +56,7 @@ EXPORT_SYMBOL_GPL(x86_spec_ctrl_base);
 
 /* The current value of the SPEC_CTRL MSR with task-specific bits set */
 DEFINE_PER_CPU(u64, x86_spec_ctrl_current);
-EXPORT_SYMBOL_GPL(x86_spec_ctrl_current);
+EXPORT_PER_CPU_SYMBOL_GPL(x86_spec_ctrl_current);
 
 u64 x86_pred_cmd __ro_after_init = PRED_CMD_IBPB;
 EXPORT_SYMBOL_GPL(x86_pred_cmd);


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 8/9] x86/cpu: Provide a declaration for itlb_multihit_kvm_mitigation
  2024-03-04 10:12 [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu) Thomas Gleixner
                   ` (6 preceding siblings ...)
  2024-03-04 10:12 ` [patch 7/9] x86/cpu: Use EXPORT_PER_CPU_SYMBOL_GPL() for x86_spec_ctrl_current Thomas Gleixner
@ 2024-03-04 10:12 ` Thomas Gleixner
  2024-03-04 12:15   ` [tip: x86/cleanups] " tip-bot2 for Thomas Gleixner
  2024-03-04 10:12 ` [patch 9/9] x86/callthunks: Use EXPORT_PER_CPU_SYMBOL_GPL() for per CPU variables Thomas Gleixner
  2024-03-04 11:08 ` [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu) Ingo Molnar
  9 siblings, 1 reply; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-04 10:12 UTC (permalink / raw)
  To: LKML; +Cc: x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

sparse complains rightfully about the missing declaration which has been
placed sloppily into the usage site:

bugs.c:2223:6: sparse: warning: symbol 'itlb_multihit_kvm_mitigation' was
	       	       		       not declared. Should it be static?

Add it to spec-ctrl.h where it belongs and remove the one in the kvm code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/spec-ctrl.h |    2 ++
 arch/x86/kvm/mmu/mmu.c           |    3 +--
 2 files changed, 3 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -96,4 +96,6 @@ static inline void speculative_store_byp
 extern void speculation_ctrl_update(unsigned long tif);
 extern void speculation_ctrl_update_current(void);
 
+extern bool itlb_multihit_kvm_mitigation;
+
 #endif
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -53,12 +53,11 @@
 #include <asm/cmpxchg.h>
 #include <asm/io.h>
 #include <asm/set_memory.h>
+#include <asm/spec-ctrl.h>
 #include <asm/vmx.h>
 
 #include "trace.h"
 
-extern bool itlb_multihit_kvm_mitigation;
-
 static bool nx_hugepage_mitigation_hard_disabled;
 
 int __read_mostly nx_huge_pages = -1;


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 9/9] x86/callthunks: Use EXPORT_PER_CPU_SYMBOL_GPL() for per CPU variables
  2024-03-04 10:12 [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu) Thomas Gleixner
                   ` (7 preceding siblings ...)
  2024-03-04 10:12 ` [patch 8/9] x86/cpu: Provide a declaration for itlb_multihit_kvm_mitigation Thomas Gleixner
@ 2024-03-04 10:12 ` Thomas Gleixner
  2024-03-04 12:15   ` [tip: x86/cleanups] " tip-bot2 for Thomas Gleixner
  2024-03-04 11:08 ` [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu) Ingo Molnar
  9 siblings, 1 reply; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-04 10:12 UTC (permalink / raw)
  To: LKML; +Cc: x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

sparse complains rightfully about the usage of EXPORT_SYMBOL_GPL() for per
CPU variables:

callthunks.c:346:20: sparse: warning: incorrect type in initializer (different address spaces)
callthunks.c:346:20: sparse:    expected void const [noderef] __percpu *__vpp_verify
callthunks.c:346:20: sparse:    got unsigned long long *

Use EXPORT_PER_CPU_SYMBOL_GPL instead.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/callthunks.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/callthunks.c
+++ b/arch/x86/kernel/callthunks.c
@@ -44,8 +44,8 @@ DEFINE_PER_CPU(u64, __x86_call_count);
 DEFINE_PER_CPU(u64, __x86_ret_count);
 DEFINE_PER_CPU(u64, __x86_stuffs_count);
 DEFINE_PER_CPU(u64, __x86_ctxsw_count);
-EXPORT_SYMBOL_GPL(__x86_ctxsw_count);
-EXPORT_SYMBOL_GPL(__x86_call_count);
+EXPORT_PER_CPU_SYMBOL_GPL(__x86_ctxsw_count);
+EXPORT_PER_CPU_SYMBOL_GPL(__x86_call_count);
 #endif
 
 extern s32 __call_sites[], __call_sites_end[];


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu)
  2024-03-04 10:12 [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu) Thomas Gleixner
                   ` (8 preceding siblings ...)
  2024-03-04 10:12 ` [patch 9/9] x86/callthunks: Use EXPORT_PER_CPU_SYMBOL_GPL() for per CPU variables Thomas Gleixner
@ 2024-03-04 11:08 ` Ingo Molnar
  9 siblings, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2024-03-04 11:08 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all


* Thomas Gleixner <tglx@linutronix.de> wrote:

> A recent 0-day report about new __percpu related sparse warnings made me
> look deeper into it after I dismissed the report as bogus initially.
> 
> It turned out that sparse is actually right and all of these warnings (not
> only the most recent ones) are valid and got ignored. Some of them for many
> years.
> 
> The worst offender is an UP build because that maps the per CPU cpu_info to
> boot_cpu_data, which is regular data.
> 
> As a consequence all per CPU accessors which look like legit code and are
> legit code in the SMP build are causing sparse to emit warnings.
> 
> This series addresses this by:
> 
>      - Adding the missing __percpu annotations all over the place
> 
>      - Curing the UP madness by exposing a proper per CPU cpu_info for the
>        price of wasting 320 byte of memory.
> 
>        Even if the size police will hate me for that, this cures most of
>        the madness in one go and avoids to add more hideous macro mess
>        similar to the completely bogus cpu_data() one which should have
>        never been there in the first place.

The market of UP-only systems running an upstream Linux kernel is shrinking 
fast, so I doubt this is a real concern.

>        I know that there are people who think that size matters, but the
>        only things which really matter in software are correctness and
>        maintainability. The latter simply forbids to add more hideous macro
>        mess just to avoid wasting 320 bytes of memory for something which
>        is mostly a reminiscence of the good old days...
>        
>      - Fixing a few obvious non __percpu related warnings which stood out
>        prominently.
> 
> That reduces the sparse warnings in arch/x86 significantly.

Great - there's also the side benefit of reduction in <asm/processor.h> 
complexity via patch #2, which is great for ongoing work to reduce header 
depdency hell ...

I've applied your Sparse fixes to tip:x86/cleanups straight away, so they 
have a chance to make it into v6.9.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [tip: x86/cleanups] x86/callthunks: Use EXPORT_PER_CPU_SYMBOL_GPL() for per CPU variables
  2024-03-04 10:12 ` [patch 9/9] x86/callthunks: Use EXPORT_PER_CPU_SYMBOL_GPL() for per CPU variables Thomas Gleixner
@ 2024-03-04 12:15   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 41+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-04 12:15 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel

The following commit has been merged into the x86/cleanups branch of tip:

Commit-ID:     cad860b59531ba4d456b3921d5ced621620d76fc
Gitweb:        https://git.kernel.org/tip/cad860b59531ba4d456b3921d5ced621620d76fc
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Mon, 04 Mar 2024 11:12:29 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Mon, 04 Mar 2024 12:09:13 +01:00

x86/callthunks: Use EXPORT_PER_CPU_SYMBOL_GPL() for per CPU variables

Sparse complains rightfully about the usage of EXPORT_SYMBOL_GPL() for per
CPU variables:

  callthunks.c:346:20: sparse: warning: incorrect type in initializer (different address spaces)
  callthunks.c:346:20: sparse:    expected void const [noderef] __percpu *__vpp_verify
  callthunks.c:346:20: sparse:    got unsigned long long *

Use EXPORT_PER_CPU_SYMBOL_GPL() instead.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240304005104.841915535@linutronix.de
---
 arch/x86/kernel/callthunks.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/callthunks.c b/arch/x86/kernel/callthunks.c
index 64ad2dd..aeccea2 100644
--- a/arch/x86/kernel/callthunks.c
+++ b/arch/x86/kernel/callthunks.c
@@ -42,8 +42,8 @@ DEFINE_PER_CPU(u64, __x86_call_count);
 DEFINE_PER_CPU(u64, __x86_ret_count);
 DEFINE_PER_CPU(u64, __x86_stuffs_count);
 DEFINE_PER_CPU(u64, __x86_ctxsw_count);
-EXPORT_SYMBOL_GPL(__x86_ctxsw_count);
-EXPORT_SYMBOL_GPL(__x86_call_count);
+EXPORT_PER_CPU_SYMBOL_GPL(__x86_ctxsw_count);
+EXPORT_PER_CPU_SYMBOL_GPL(__x86_call_count);
 #endif
 
 extern s32 __call_sites[], __call_sites_end[];

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [tip: x86/cleanups] x86/cpu: Provide a declaration for itlb_multihit_kvm_mitigation
  2024-03-04 10:12 ` [patch 8/9] x86/cpu: Provide a declaration for itlb_multihit_kvm_mitigation Thomas Gleixner
@ 2024-03-04 12:15   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 41+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-04 12:15 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel

The following commit has been merged into the x86/cleanups branch of tip:

Commit-ID:     65efc4dc12c5cc296374278673b89390eba79fe6
Gitweb:        https://git.kernel.org/tip/65efc4dc12c5cc296374278673b89390eba79fe6
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Mon, 04 Mar 2024 11:12:28 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Mon, 04 Mar 2024 12:09:13 +01:00

x86/cpu: Provide a declaration for itlb_multihit_kvm_mitigation

Sparse complains rightfully about the missing declaration which has been
placed sloppily into the usage site:

  bugs.c:2223:6: sparse: warning: symbol 'itlb_multihit_kvm_mitigation' was not declared. Should it be static?

Add it to <asm/spec-ctrl.h> where it belongs and remove the one in the KVM code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240304005104.787173239@linutronix.de
---
 arch/x86/include/asm/spec-ctrl.h | 2 ++
 arch/x86/kvm/mmu/mmu.c           | 3 +--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h
index c648502..658b690 100644
--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -96,4 +96,6 @@ static inline void speculative_store_bypass_ht_init(void) { }
 extern void speculation_ctrl_update(unsigned long tif);
 extern void speculation_ctrl_update_current(void);
 
+extern bool itlb_multihit_kvm_mitigation;
+
 #endif
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 2d6cdea..3c89d3e 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -53,12 +53,11 @@
 #include <asm/cmpxchg.h>
 #include <asm/io.h>
 #include <asm/set_memory.h>
+#include <asm/spec-ctrl.h>
 #include <asm/vmx.h>
 
 #include "trace.h"
 
-extern bool itlb_multihit_kvm_mitigation;
-
 static bool nx_hugepage_mitigation_hard_disabled;
 
 int __read_mostly nx_huge_pages = -1;

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [tip: x86/cleanups] x86/cpu: Use EXPORT_PER_CPU_SYMBOL_GPL() for x86_spec_ctrl_current
  2024-03-04 10:12 ` [patch 7/9] x86/cpu: Use EXPORT_PER_CPU_SYMBOL_GPL() for x86_spec_ctrl_current Thomas Gleixner
@ 2024-03-04 12:15   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 41+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-04 12:15 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel

The following commit has been merged into the x86/cleanups branch of tip:

Commit-ID:     ca3ec9e5540428afa3d372fb39f0d88c4f44af8b
Gitweb:        https://git.kernel.org/tip/ca3ec9e5540428afa3d372fb39f0d88c4f44af8b
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Mon, 04 Mar 2024 11:12:26 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Mon, 04 Mar 2024 12:09:13 +01:00

x86/cpu: Use EXPORT_PER_CPU_SYMBOL_GPL() for x86_spec_ctrl_current

Sparse rightfully complains:

  bugs.c:71:9: sparse: warning: incorrect type in initializer (different address spaces)
  bugs.c:71:9: sparse:    expected void const [noderef] __percpu *__vpp_verify
  bugs.c:71:9: sparse:    got unsigned long long *

The reason is that x86_spec_ctrl_current which is a per CPU variable is
exported with EXPORT_SYMBOL_GPL().

Use EXPORT_PER_CPU_SYMBOL_GPL() instead.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240304005104.732288812@linutronix.de
---
 arch/x86/kernel/cpu/bugs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 48d049c..0767ab6 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -56,7 +56,7 @@ EXPORT_SYMBOL_GPL(x86_spec_ctrl_base);
 
 /* The current value of the SPEC_CTRL MSR with task-specific bits set */
 DEFINE_PER_CPU(u64, x86_spec_ctrl_current);
-EXPORT_SYMBOL_GPL(x86_spec_ctrl_current);
+EXPORT_PER_CPU_SYMBOL_GPL(x86_spec_ctrl_current);
 
 u64 x86_pred_cmd __ro_after_init = PRED_CMD_IBPB;
 EXPORT_SYMBOL_GPL(x86_pred_cmd);

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [tip: x86/cleanups] x86/uaccess: Add missing __force to casts in __access_ok() and valid_user_address()
  2024-03-04 10:12 ` [patch 6/9] x86/uaccess: Add missing __force to casts in __access_ok() and valid_user_address() Thomas Gleixner
@ 2024-03-04 12:15   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 41+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-04 12:15 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel

The following commit has been merged into the x86/cleanups branch of tip:

Commit-ID:     ae6b0195f5c503f22673a4acf21111822305c1e0
Gitweb:        https://git.kernel.org/tip/ae6b0195f5c503f22673a4acf21111822305c1e0
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Mon, 04 Mar 2024 11:12:25 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Mon, 04 Mar 2024 12:09:12 +01:00

x86/uaccess: Add missing __force to casts in __access_ok() and valid_user_address()

Sparse complains about losing the __user address space due to the cast to
long:

  uaccess_64.h:88:24: sparse: warning: cast removes address space '__user' of expression

Annotate it with __force to tell sparse that this is intentional.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240304005104.677606054@linutronix.de
---
 arch/x86/include/asm/uaccess_64.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index f2c02e4..d2a0c2a 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -54,7 +54,7 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm,
  * half and a user half.  When cast to a signed type, user pointers
  * are positive and kernel pointers are negative.
  */
-#define valid_user_address(x) ((long)(x) >= 0)
+#define valid_user_address(x) ((__force long)(x) >= 0)
 
 /*
  * User pointers can have tag bits on x86-64.  This scheme tolerates
@@ -87,8 +87,9 @@ static inline bool __access_ok(const void __user *ptr, unsigned long size)
 	if (__builtin_constant_p(size <= PAGE_SIZE) && size <= PAGE_SIZE) {
 		return valid_user_address(ptr);
 	} else {
-		unsigned long sum = size + (unsigned long)ptr;
-		return valid_user_address(sum) && sum >= (unsigned long)ptr;
+		unsigned long sum = size + (__force unsigned long)ptr;
+
+		return valid_user_address(sum) && sum >= (__force unsigned long)ptr;
 	}
 }
 #define __access_ok __access_ok

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [tip: x86/cleanups] x86/percpu: Cure per CPU madness on UP
  2024-03-04 10:12 ` [patch 5/9] x86: Cure per CPU madness on UP Thomas Gleixner
@ 2024-03-04 12:15   ` tip-bot2 for Thomas Gleixner
  2024-03-15 16:17   ` [patch 5/9] x86: " Guenter Roeck
  1 sibling, 0 replies; 41+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-04 12:15 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel

The following commit has been merged into the x86/cleanups branch of tip:

Commit-ID:     71eb4893cfaf37f8884515c8f71717044b97bf44
Gitweb:        https://git.kernel.org/tip/71eb4893cfaf37f8884515c8f71717044b97bf44
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Mon, 04 Mar 2024 11:12:23 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Mon, 04 Mar 2024 12:09:07 +01:00

x86/percpu: Cure per CPU madness on UP

On UP builds Sparse complains rightfully about accesses to cpu_info with
per CPU accessors:

  cacheinfo.c:282:30: sparse: warning: incorrect type in initializer (different address spaces)
  cacheinfo.c:282:30: sparse:    expected void const [noderef] __percpu *__vpp_verify
  cacheinfo.c:282:30: sparse:    got unsigned int *

The reason is that on UP builds cpu_info which is a per CPU variable on SMP
is mapped to boot_cpu_info which is a regular variable. There is a hideous
accessor cpu_data() which tries to hide this, but it's not sufficient as
some places require raw accessors and generates worse code than the regular
per CPU accessors.

Waste sizeof(struct x86_cpuinfo) memory on UP and provide the per CPU
cpu_info unconditionally. This requires to update the CPU info on the boot
CPU as SMP does. (Ab)use the weakly defined smp_prepare_boot_cpu() function
and implement exactly that.

This allows to use regular per CPU accessors uncoditionally and paves the
way to remove the cpu_data() hackery.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240304005104.622511517@linutronix.de
---
 arch/x86/include/asm/processor.h |  5 -----
 arch/x86/kernel/cpu/common.c     |  3 +++
 arch/x86/kernel/setup.c          | 10 ++++++++++
 arch/x86/kernel/smpboot.c        |  4 ----
 4 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a61f769..e2262ac 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -185,13 +185,8 @@ extern struct cpuinfo_x86	new_cpu_data;
 extern __u32			cpu_caps_cleared[NCAPINTS + NBUGINTS];
 extern __u32			cpu_caps_set[NCAPINTS + NBUGINTS];
 
-#ifdef CONFIG_SMP
 DECLARE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
 #define cpu_data(cpu)		per_cpu(cpu_info, cpu)
-#else
-#define cpu_info		boot_cpu_data
-#define cpu_data(cpu)		boot_cpu_data
-#endif
 
 extern const struct seq_operations cpuinfo_op;
 
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index fbc4e60..6057a9e 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -70,6 +70,9 @@
 
 #include "cpu.h"
 
+DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
+EXPORT_PER_CPU_SYMBOL(cpu_info);
+
 u32 elf_hwcap2 __read_mostly;
 
 /* Number of siblings per CPU package */
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 8420107..8f669d3 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1211,6 +1211,16 @@ void __init i386_reserve_resources(void)
 
 #endif /* CONFIG_X86_32 */
 
+#ifndef CONFIG_SMP
+void __init smp_prepare_boot_cpu(void)
+{
+	struct cpuinfo_x86 *c = &cpu_data(0);
+
+	*c = boot_cpu_data;
+	c->initialized = true;
+}
+#endif
+
 static struct notifier_block kernel_offset_notifier = {
 	.notifier_call = dump_kernel_offset
 };
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 980782b..37ea8c8 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -101,10 +101,6 @@ EXPORT_PER_CPU_SYMBOL(cpu_core_map);
 DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_die_map);
 EXPORT_PER_CPU_SYMBOL(cpu_die_map);
 
-/* Per CPU bogomips and other parameters */
-DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
-EXPORT_PER_CPU_SYMBOL(cpu_info);
-
 /* CPUs which are the primary SMT threads */
 struct cpumask __cpu_primary_thread_mask __read_mostly;
 

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [tip: x86/cleanups] smp: Consolidate smp_prepare_boot_cpu()
  2024-03-04 10:12 ` [patch 4/9] smp: Consolidate smp_prepare_boot_cpu() Thomas Gleixner
@ 2024-03-04 12:15   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 41+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-04 12:15 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel

The following commit has been merged into the x86/cleanups branch of tip:

Commit-ID:     712610725c48c829e42bebfc9908cd92468e2731
Gitweb:        https://git.kernel.org/tip/712610725c48c829e42bebfc9908cd92468e2731
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Mon, 04 Mar 2024 11:12:22 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Mon, 04 Mar 2024 12:01:54 +01:00

smp: Consolidate smp_prepare_boot_cpu()

There is no point in having seven architectures implementing the same empty
stub.

Provide a weak function in the init code and remove the stubs.

This also allows to utilize the function on UP which is required to
sanitize the per CPU handling on X86 UP.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240304005104.567671691@linutronix.de
---
 arch/alpha/kernel/smp.c     |  5 -----
 arch/arc/kernel/smp.c       |  5 -----
 arch/csky/kernel/smp.c      |  4 ----
 arch/hexagon/kernel/smp.c   |  4 ----
 arch/openrisc/kernel/smp.c  |  4 ----
 arch/riscv/kernel/smpboot.c |  4 ----
 arch/sparc/kernel/smp_64.c  |  4 ----
 arch/x86/include/asm/smp.h  |  5 -----
 arch/x86/kernel/smpboot.c   |  5 +++++
 include/linux/smp.h         | 13 ++++++-------
 init/main.c                 |  4 ++++
 11 files changed, 15 insertions(+), 42 deletions(-)

diff --git a/arch/alpha/kernel/smp.c b/arch/alpha/kernel/smp.c
index 7439b23..8e9dd63 100644
--- a/arch/alpha/kernel/smp.c
+++ b/arch/alpha/kernel/smp.c
@@ -467,11 +467,6 @@ smp_prepare_cpus(unsigned int max_cpus)
 	smp_num_cpus = smp_num_probed;
 }
 
-void
-smp_prepare_boot_cpu(void)
-{
-}
-
 int
 __cpu_up(unsigned int cpu, struct task_struct *tidle)
 {
diff --git a/arch/arc/kernel/smp.c b/arch/arc/kernel/smp.c
index 8d9b188..b2f2c59 100644
--- a/arch/arc/kernel/smp.c
+++ b/arch/arc/kernel/smp.c
@@ -39,11 +39,6 @@ struct plat_smp_ops  __weak plat_smp_ops;
 /* XXX: per cpu ? Only needed once in early secondary boot */
 struct task_struct *secondary_idle_tsk;
 
-/* Called from start_kernel */
-void __init smp_prepare_boot_cpu(void)
-{
-}
-
 static int __init arc_get_cpu_map(const char *name, struct cpumask *cpumask)
 {
 	unsigned long dt_root = of_get_flat_dt_root();
diff --git a/arch/csky/kernel/smp.c b/arch/csky/kernel/smp.c
index 8e42352..92dbbf3 100644
--- a/arch/csky/kernel/smp.c
+++ b/arch/csky/kernel/smp.c
@@ -152,10 +152,6 @@ void arch_irq_work_raise(void)
 }
 #endif
 
-void __init smp_prepare_boot_cpu(void)
-{
-}
-
 void __init smp_prepare_cpus(unsigned int max_cpus)
 {
 }
diff --git a/arch/hexagon/kernel/smp.c b/arch/hexagon/kernel/smp.c
index 608884b..65e1fdf 100644
--- a/arch/hexagon/kernel/smp.c
+++ b/arch/hexagon/kernel/smp.c
@@ -114,10 +114,6 @@ void send_ipi(const struct cpumask *cpumask, enum ipi_message_type msg)
 	local_irq_restore(flags);
 }
 
-void __init smp_prepare_boot_cpu(void)
-{
-}
-
 /*
  * interrupts should already be disabled from the VM
  * SP should already be correct; need to set THREADINFO_REG
diff --git a/arch/openrisc/kernel/smp.c b/arch/openrisc/kernel/smp.c
index 1c5a2d7..86da4bc 100644
--- a/arch/openrisc/kernel/smp.c
+++ b/arch/openrisc/kernel/smp.c
@@ -57,10 +57,6 @@ static void boot_secondary(unsigned int cpu, struct task_struct *idle)
 	spin_unlock(&boot_lock);
 }
 
-void __init smp_prepare_boot_cpu(void)
-{
-}
-
 void __init smp_init_cpus(void)
 {
 	struct device_node *cpu;
diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index 519b6bd..c4ed7d9 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -42,10 +42,6 @@
 
 static DECLARE_COMPLETION(cpu_running);
 
-void __init smp_prepare_boot_cpu(void)
-{
-}
-
 void __init smp_prepare_cpus(unsigned int max_cpus)
 {
 	int cpuid;
diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
index f3969a3..a0cc9bb 100644
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -1206,10 +1206,6 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 {
 }
 
-void smp_prepare_boot_cpu(void)
-{
-}
-
 void __init smp_setup_processor_id(void)
 {
 	if (tlb_type == spitfire)
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 4fab2ed..31edeab 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -59,11 +59,6 @@ static inline void stop_other_cpus(void)
 	smp_ops.stop_other_cpus(1);
 }
 
-static inline void smp_prepare_boot_cpu(void)
-{
-	smp_ops.smp_prepare_boot_cpu();
-}
-
 static inline void smp_prepare_cpus(unsigned int max_cpus)
 {
 	smp_ops.smp_prepare_cpus(max_cpus);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3f57ce6..980782b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1187,6 +1187,11 @@ void __init smp_prepare_cpus_common(void)
 	set_cpu_sibling_map(0);
 }
 
+void __init smp_prepare_boot_cpu(void)
+{
+	smp_ops.smp_prepare_boot_cpu();
+}
+
 #ifdef CONFIG_X86_64
 /* Establish whether parallel bringup can be supported. */
 bool __init arch_cpuhp_init_parallel_bringup(void)
diff --git a/include/linux/smp.h b/include/linux/smp.h
index e87520d..b845929 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -105,6 +105,12 @@ static inline void on_each_cpu_cond(smp_cond_func_t cond_func,
 	on_each_cpu_cond_mask(cond_func, func, info, wait, cpu_online_mask);
 }
 
+/*
+ * Architecture specific boot CPU setup.  Defined as empty weak function in
+ * init/main.c. Architectures can override it.
+ */
+void smp_prepare_boot_cpu(void);
+
 #ifdef CONFIG_SMP
 
 #include <linux/preempt.h>
@@ -171,12 +177,6 @@ void generic_smp_call_function_single_interrupt(void);
 #define generic_smp_call_function_interrupt \
 	generic_smp_call_function_single_interrupt
 
-/*
- * Mark the boot cpu "online" so that it can call console drivers in
- * printk() and can access its per-cpu storage.
- */
-void smp_prepare_boot_cpu(void);
-
 extern unsigned int setup_max_cpus;
 extern void __init setup_nr_cpu_ids(void);
 extern void __init smp_init(void);
@@ -203,7 +203,6 @@ static inline void up_smp_call_function(smp_call_func_t func, void *info)
 			(up_smp_call_function(func, info))
 
 static inline void smp_send_reschedule(int cpu) { }
-#define smp_prepare_boot_cpu()			do {} while (0)
 #define smp_call_function_many(mask, func, info, wait) \
 			(up_smp_call_function(func, info))
 static inline void call_function_init(void) { }
diff --git a/init/main.c b/init/main.c
index e24b078..d60bc4b 100644
--- a/init/main.c
+++ b/init/main.c
@@ -776,6 +776,10 @@ void __init __weak smp_setup_processor_id(void)
 {
 }
 
+void __init __weak smp_prepare_boot_cpu(void)
+{
+}
+
 # if THREAD_SIZE >= PAGE_SIZE
 void __init __weak thread_stack_cache_init(void)
 {

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [tip: x86/cleanups] x86/msr: Add missing __percpu annotations
  2024-03-04 10:12 ` [patch 3/9] x86/msr: Add missing __percpu annotations Thomas Gleixner
@ 2024-03-04 12:15   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 41+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-04 12:15 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel

The following commit has been merged into the x86/cleanups branch of tip:

Commit-ID:     5323922f50ecdf9d10cdd2fabd06507e5b4f3feb
Gitweb:        https://git.kernel.org/tip/5323922f50ecdf9d10cdd2fabd06507e5b4f3feb
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Mon, 04 Mar 2024 11:12:20 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Mon, 04 Mar 2024 12:01:54 +01:00

x86/msr: Add missing __percpu annotations

Sparse rightfully complains about using a plain pointer for per CPU
accessors:

  msr-smp.c:15:23: sparse: warning: incorrect type in initializer (different address spaces)
  msr-smp.c:15:23: sparse:    expected void const [noderef] __percpu *__vpp_verify
  msr-smp.c:15:23: sparse:    got struct msr *

Add __percpu annotations to the related datastructure and function
arguments to cure this. This also cures the related sparse warnings at the
callsites in drivers/edac/amd64_edac.c.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240304005104.513181735@linutronix.de
---
 arch/x86/include/asm/msr.h       | 26 ++++++++++++++------------
 arch/x86/include/asm/processor.h |  1 -
 arch/x86/include/asm/tsc.h       |  3 ++-
 arch/x86/lib/msr-smp.c           | 12 +++++-------
 arch/x86/lib/msr.c               |  6 +++---
 5 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 65ec196..4621e08 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -12,11 +12,13 @@
 #include <uapi/asm/msr.h>
 #include <asm/shared/msr.h>
 
+#include <linux/percpu.h>
+
 struct msr_info {
-	u32 msr_no;
-	struct msr reg;
-	struct msr *msrs;
-	int err;
+	u32			msr_no;
+	struct msr		reg;
+	struct msr __percpu	*msrs;
+	int			err;
 };
 
 struct msr_regs_info {
@@ -305,8 +307,8 @@ static inline int wrmsrl_safe(u32 msr, u64 val)
 	return wrmsr_safe(msr, (u32)val,  (u32)(val >> 32));
 }
 
-struct msr *msrs_alloc(void);
-void msrs_free(struct msr *msrs);
+struct msr __percpu *msrs_alloc(void);
+void msrs_free(struct msr __percpu *msrs);
 int msr_set_bit(u32 msr, u8 bit);
 int msr_clear_bit(u32 msr, u8 bit);
 
@@ -315,8 +317,8 @@ int rdmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h);
 int wrmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h);
 int rdmsrl_on_cpu(unsigned int cpu, u32 msr_no, u64 *q);
 int wrmsrl_on_cpu(unsigned int cpu, u32 msr_no, u64 q);
-void rdmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr *msrs);
-void wrmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr *msrs);
+void rdmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr __percpu *msrs);
+void wrmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr __percpu *msrs);
 int rdmsr_safe_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h);
 int wrmsr_safe_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h);
 int rdmsrl_safe_on_cpu(unsigned int cpu, u32 msr_no, u64 *q);
@@ -345,14 +347,14 @@ static inline int wrmsrl_on_cpu(unsigned int cpu, u32 msr_no, u64 q)
 	return 0;
 }
 static inline void rdmsr_on_cpus(const struct cpumask *m, u32 msr_no,
-				struct msr *msrs)
+				struct msr __percpu *msrs)
 {
-	rdmsr_on_cpu(0, msr_no, &(msrs[0].l), &(msrs[0].h));
+	rdmsr_on_cpu(0, msr_no, raw_cpu_ptr(&msrs->l), raw_cpu_ptr(&msrs->h));
 }
 static inline void wrmsr_on_cpus(const struct cpumask *m, u32 msr_no,
-				struct msr *msrs)
+				struct msr __percpu *msrs)
 {
-	wrmsr_on_cpu(0, msr_no, msrs[0].l, msrs[0].h);
+	wrmsr_on_cpu(0, msr_no, raw_cpu_read(msrs->l), raw_cpu_read(msrs->h));
 }
 static inline int rdmsr_safe_on_cpu(unsigned int cpu, u32 msr_no,
 				    u32 *l, u32 *h)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index d2ef4f5..a61f769 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -20,7 +20,6 @@ struct vm86;
 #include <asm/page.h>
 #include <asm/pgtable_types.h>
 #include <asm/percpu.h>
-#include <asm/msr.h>
 #include <asm/desc_defs.h>
 #include <asm/nops.h>
 #include <asm/special_insns.h>
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 594fce0..405efb3 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -5,8 +5,9 @@
 #ifndef _ASM_X86_TSC_H
 #define _ASM_X86_TSC_H
 
-#include <asm/processor.h>
 #include <asm/cpufeature.h>
+#include <asm/processor.h>
+#include <asm/msr.h>
 
 /*
  * Standard way to access the cycle counter.
diff --git a/arch/x86/lib/msr-smp.c b/arch/x86/lib/msr-smp.c
index 40bbe56..acd463d 100644
--- a/arch/x86/lib/msr-smp.c
+++ b/arch/x86/lib/msr-smp.c
@@ -9,10 +9,9 @@ static void __rdmsr_on_cpu(void *info)
 {
 	struct msr_info *rv = info;
 	struct msr *reg;
-	int this_cpu = raw_smp_processor_id();
 
 	if (rv->msrs)
-		reg = per_cpu_ptr(rv->msrs, this_cpu);
+		reg = this_cpu_ptr(rv->msrs);
 	else
 		reg = &rv->reg;
 
@@ -23,10 +22,9 @@ static void __wrmsr_on_cpu(void *info)
 {
 	struct msr_info *rv = info;
 	struct msr *reg;
-	int this_cpu = raw_smp_processor_id();
 
 	if (rv->msrs)
-		reg = per_cpu_ptr(rv->msrs, this_cpu);
+		reg = this_cpu_ptr(rv->msrs);
 	else
 		reg = &rv->reg;
 
@@ -97,7 +95,7 @@ int wrmsrl_on_cpu(unsigned int cpu, u32 msr_no, u64 q)
 EXPORT_SYMBOL(wrmsrl_on_cpu);
 
 static void __rwmsr_on_cpus(const struct cpumask *mask, u32 msr_no,
-			    struct msr *msrs,
+			    struct msr __percpu *msrs,
 			    void (*msr_func) (void *info))
 {
 	struct msr_info rv;
@@ -124,7 +122,7 @@ static void __rwmsr_on_cpus(const struct cpumask *mask, u32 msr_no,
  * @msrs:       array of MSR values
  *
  */
-void rdmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr *msrs)
+void rdmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr __percpu *msrs)
 {
 	__rwmsr_on_cpus(mask, msr_no, msrs, __rdmsr_on_cpu);
 }
@@ -138,7 +136,7 @@ EXPORT_SYMBOL(rdmsr_on_cpus);
  * @msrs:       array of MSR values
  *
  */
-void wrmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr *msrs)
+void wrmsr_on_cpus(const struct cpumask *mask, u32 msr_no, struct msr __percpu *msrs)
 {
 	__rwmsr_on_cpus(mask, msr_no, msrs, __wrmsr_on_cpu);
 }
diff --git a/arch/x86/lib/msr.c b/arch/x86/lib/msr.c
index 47fd9bd..4bf4fad 100644
--- a/arch/x86/lib/msr.c
+++ b/arch/x86/lib/msr.c
@@ -6,9 +6,9 @@
 #define CREATE_TRACE_POINTS
 #include <asm/msr-trace.h>
 
-struct msr *msrs_alloc(void)
+struct msr __percpu *msrs_alloc(void)
 {
-	struct msr *msrs = NULL;
+	struct msr __percpu *msrs = NULL;
 
 	msrs = alloc_percpu(struct msr);
 	if (!msrs) {
@@ -20,7 +20,7 @@ struct msr *msrs_alloc(void)
 }
 EXPORT_SYMBOL(msrs_alloc);
 
-void msrs_free(struct msr *msrs)
+void msrs_free(struct msr __percpu *msrs)
 {
 	free_percpu(msrs);
 }

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [tip: x86/cleanups] x86/msr: Prepare for including <linux/percpu.h> into <asm/msr.h>
  2024-03-04 10:12 ` [patch 2/9] x86/msr: Prepare for including percpu.h Thomas Gleixner
@ 2024-03-04 12:15   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 41+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-04 12:15 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel

The following commit has been merged into the x86/cleanups branch of tip:

Commit-ID:     154fcf3a788868cb87d8c2e50c0b5b3a2fe89853
Gitweb:        https://git.kernel.org/tip/154fcf3a788868cb87d8c2e50c0b5b3a2fe89853
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Mon, 04 Mar 2024 11:12:19 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Mon, 04 Mar 2024 12:01:39 +01:00

x86/msr: Prepare for including <linux/percpu.h> into <asm/msr.h>

To clean up the per CPU insanity of UP which causes sparse to be rightfully
unhappy and prevents the usage of the generic per CPU accessors on cpu_info
it is necessary to include <linux/percpu.h> into <asm/msr.h>.

Including <linux/percpu.h> into <asm/msr.h> is impossible because it ends
up in header dependency hell. The problem is that <asm/processor.h>
includes <asm/msr.h>. The inclusion of <linux/percpu.h> results in a
compile fail where the compiler cannot longer handle an include in
<asm/cpufeature.h> which references boot_cpu_data which is
defined in <asm/processor.h>.

The only reason why <asm/msr.h> is included in <asm/processor.h> are the
set/get_debugctlmsr() inlines. They are defined there because <asm/processor.h>
is such a nice dump ground for everything. In fact they belong obviously
into <asm/debugreg.h>.

Move them to <asm/debugreg.h> and fix up the resulting damage which is just
exposing the reliance on random include chains.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240304005104.454678686@linutronix.de
---
 arch/x86/events/intel/core.c         |  1 +
 arch/x86/events/intel/ds.c           |  1 +
 arch/x86/include/asm/debugreg.h      | 24 ++++++++++++++++++++++++
 arch/x86/include/asm/fsgsbase.h      |  2 +-
 arch/x86/include/asm/processor.h     | 22 ----------------------
 arch/x86/include/asm/special_insns.h |  4 ++--
 arch/x86/kernel/cpu/intel_pconfig.c  |  2 ++
 arch/x86/kernel/cpu/rdrand.c         |  1 +
 arch/x86/kernel/fpu/bugs.c           |  2 ++
 arch/x86/kernel/step.c               |  2 ++
 10 files changed, 36 insertions(+), 25 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 3804f21..768d141 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -17,6 +17,7 @@
 #include <linux/kvm_host.h>
 
 #include <asm/cpufeature.h>
+#include <asm/debugreg.h>
 #include <asm/hardirq.h>
 #include <asm/intel-family.h>
 #include <asm/intel_pt.h>
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index d49d661..2641ba6 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -5,6 +5,7 @@
 #include <linux/sched/clock.h>
 
 #include <asm/cpu_entry_area.h>
+#include <asm/debugreg.h>
 #include <asm/perf_event.h>
 #include <asm/tlbflush.h>
 #include <asm/insn.h>
diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h
index 0cec92c..fdbbbfe 100644
--- a/arch/x86/include/asm/debugreg.h
+++ b/arch/x86/include/asm/debugreg.h
@@ -5,7 +5,9 @@
 #include <linux/bug.h>
 #include <linux/percpu.h>
 #include <uapi/asm/debugreg.h>
+
 #include <asm/cpufeature.h>
+#include <asm/msr.h>
 
 DECLARE_PER_CPU(unsigned long, cpu_dr7);
 
@@ -159,4 +161,26 @@ static inline unsigned long amd_get_dr_addr_mask(unsigned int dr)
 }
 #endif
 
+static inline unsigned long get_debugctlmsr(void)
+{
+	unsigned long debugctlmsr = 0;
+
+#ifndef CONFIG_X86_DEBUGCTLMSR
+	if (boot_cpu_data.x86 < 6)
+		return 0;
+#endif
+	rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctlmsr);
+
+	return debugctlmsr;
+}
+
+static inline void update_debugctlmsr(unsigned long debugctlmsr)
+{
+#ifndef CONFIG_X86_DEBUGCTLMSR
+	if (boot_cpu_data.x86 < 6)
+		return;
+#endif
+	wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctlmsr);
+}
+
 #endif /* _ASM_X86_DEBUGREG_H */
diff --git a/arch/x86/include/asm/fsgsbase.h b/arch/x86/include/asm/fsgsbase.h
index 35cff5f..9e7e8ca 100644
--- a/arch/x86/include/asm/fsgsbase.h
+++ b/arch/x86/include/asm/fsgsbase.h
@@ -6,7 +6,7 @@
 
 #ifdef CONFIG_X86_64
 
-#include <asm/msr-index.h>
+#include <asm/msr.h>
 
 /*
  * Read/write a task's FSBASE or GSBASE. This returns the value that
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 26620d7..d2ef4f5 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -576,28 +576,6 @@ extern void cpu_init(void);
 extern void cpu_init_exception_handling(void);
 extern void cr4_init(void);
 
-static inline unsigned long get_debugctlmsr(void)
-{
-	unsigned long debugctlmsr = 0;
-
-#ifndef CONFIG_X86_DEBUGCTLMSR
-	if (boot_cpu_data.x86 < 6)
-		return 0;
-#endif
-	rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctlmsr);
-
-	return debugctlmsr;
-}
-
-static inline void update_debugctlmsr(unsigned long debugctlmsr)
-{
-#ifndef CONFIG_X86_DEBUGCTLMSR
-	if (boot_cpu_data.x86 < 6)
-		return;
-#endif
-	wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctlmsr);
-}
-
 extern void set_task_blockstep(struct task_struct *task, bool on);
 
 /* Boot loader type from the setup header: */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 48f8dd4..f13df37 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -2,11 +2,11 @@
 #ifndef _ASM_X86_SPECIAL_INSNS_H
 #define _ASM_X86_SPECIAL_INSNS_H
 
-
 #ifdef __KERNEL__
-
 #include <asm/nops.h>
 #include <asm/processor-flags.h>
+
+#include <linux/errno.h>
 #include <linux/irqflags.h>
 #include <linux/jump_label.h>
 
diff --git a/arch/x86/kernel/cpu/intel_pconfig.c b/arch/x86/kernel/cpu/intel_pconfig.c
index 0771a90..5be2b17 100644
--- a/arch/x86/kernel/cpu/intel_pconfig.c
+++ b/arch/x86/kernel/cpu/intel_pconfig.c
@@ -7,6 +7,8 @@
  * Author:
  *	Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
  */
+#include <linux/bug.h>
+#include <linux/limits.h>
 
 #include <asm/cpufeature.h>
 #include <asm/intel_pconfig.h>
diff --git a/arch/x86/kernel/cpu/rdrand.c b/arch/x86/kernel/cpu/rdrand.c
index 26a427f..eeac00d 100644
--- a/arch/x86/kernel/cpu/rdrand.c
+++ b/arch/x86/kernel/cpu/rdrand.c
@@ -6,6 +6,7 @@
  * Authors: Fenghua Yu <fenghua.yu@intel.com>,
  *          H. Peter Anvin <hpa@linux.intel.com>
  */
+#include <linux/printk.h>
 
 #include <asm/processor.h>
 #include <asm/archrandom.h>
diff --git a/arch/x86/kernel/fpu/bugs.c b/arch/x86/kernel/fpu/bugs.c
index a06b876..edbafc5 100644
--- a/arch/x86/kernel/fpu/bugs.c
+++ b/arch/x86/kernel/fpu/bugs.c
@@ -2,6 +2,8 @@
 /*
  * x86 FPU bug checks:
  */
+#include <linux/printk.h>
+
 #include <asm/cpufeature.h>
 #include <asm/fpu/api.h>
 
diff --git a/arch/x86/kernel/step.c b/arch/x86/kernel/step.c
index 8e2b255..3e29526 100644
--- a/arch/x86/kernel/step.c
+++ b/arch/x86/kernel/step.c
@@ -6,7 +6,9 @@
 #include <linux/sched/task_stack.h>
 #include <linux/mm.h>
 #include <linux/ptrace.h>
+
 #include <asm/desc.h>
+#include <asm/debugreg.h>
 #include <asm/mmu_context.h>
 
 unsigned long convert_ip_to_linear(struct task_struct *child, struct pt_regs *regs)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [tip: x86/cleanups] perf/x86/amd/uncore: Fix __percpu annotation
  2024-03-04 10:12 ` [patch 1/9] perf/x86/amd/uncore: Fix __percpu annotation Thomas Gleixner
@ 2024-03-04 12:15   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 41+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-04 12:15 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel

The following commit has been merged into the x86/cleanups branch of tip:

Commit-ID:     9eae297d5d8d87738a14010af62b2b64b9d98097
Gitweb:        https://git.kernel.org/tip/9eae297d5d8d87738a14010af62b2b64b9d98097
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Mon, 04 Mar 2024 11:12:18 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Mon, 04 Mar 2024 11:58:36 +01:00

perf/x86/amd/uncore: Fix __percpu annotation

The __percpu annotation in struct amd_uncore is confusing Sparse:

  uncore.c:649:10: sparse: warning: incorrect type in initializer (different address spaces)
  uncore.c:649:10: sparse:    expected void const [noderef] __percpu *__vpp_verify
  uncore.c:649:10: sparse:    got union amd_uncore_info *

The reason is that the __percpu annotation sits between the '*'
dereferencing operator and the member name.

Move it before the dereferencing operator to cure this.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240304005104.394845326@linutronix.de
---
 arch/x86/events/amd/uncore.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
index 5bf03c5..4ccb8fa 100644
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -71,7 +71,7 @@ union amd_uncore_info {
 };
 
 struct amd_uncore {
-	union amd_uncore_info * __percpu info;
+	union amd_uncore_info  __percpu *info;
 	struct amd_uncore_pmu *pmus;
 	unsigned int num_pmus;
 	bool init_done;

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-04 10:12 ` [patch 5/9] x86: Cure per CPU madness on UP Thomas Gleixner
  2024-03-04 12:15   ` [tip: x86/cleanups] x86/percpu: " tip-bot2 for Thomas Gleixner
@ 2024-03-15 16:17   ` Guenter Roeck
  2024-03-15 16:42     ` Linus Torvalds
  2024-03-20  8:58     ` Thomas Gleixner
  1 sibling, 2 replies; 41+ messages in thread
From: Guenter Roeck @ 2024-03-15 16:17 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

Hi,

On Mon, Mar 04, 2024 at 11:12:23AM +0100, Thomas Gleixner wrote:
> On UP builds sparse complains rightfully about accesses to cpu_info with
> per CPU accessors:
> 
> cacheinfo.c:282:30: sparse: warning: incorrect type in initializer (different address spaces)
> cacheinfo.c:282:30: sparse:    expected void const [noderef] __percpu *__vpp_verify
> cacheinfo.c:282:30: sparse:    got unsigned int *
> 
> The reason is that on UP builds cpu_info which is a per CPU variable on SMP
> is mapped to boot_cpu_info which is a regular variable. There is a hideous
> accessor cpu_data() which tries to hide this, but it's not sufficient as
> some places require raw accessors and generates worse code than the regular
> per CPU accessors.
> 
> Waste sizeof(struct x86_cpuinfo) memory on UP and provide the per CPU
> cpu_info unconditionally. This requires to update the CPU info on the boot
> CPU as SMP does. (Ab)use the weakly defined smp_prepare_boot_cpu() function
> and implement exactly that.
> 
> This allows to use regular per CPU accessors uncoditionally and paves the
> way to remove the cpu_data() hackery.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

This patch results in crashes when running the mainline kernel in qemu
with nosmp builds and Intel CPUs. The problem is _not_ seen on tag
x86-cleanups-2024-03-11; it is only seen in the mainline kernel. I didn't
check all of them, but it looks like AMD CPUs are not affected. The
initial bisect points to the merge of x86-cleanups-2024-03-11 into the
mainline kernel. I rebased x86-cleanups-2024-03-11 on top of the mainline
kernel; the second bisect uses that rebase as base. Reverting this patch
from the mainline kernel fixes the problem.

I don't know the code well enough to determine what is wrong.
Please let me know what I can do to help debugging the problem.

Thanks,
Guenter

----
crash log:

[    3.291087] BUG: unable to handle page fault for address: ffff9cd801f3f2a0
[    3.291087] #PF: supervisor write access in kernel mode
[    3.291087] #PF: error_code(0x0002) - not-present page
[    3.291087] PGD 60201067 P4D 60201067 PUD 0
[    3.291087] Oops: 0002 [#1] PREEMPT PTI
[    3.291087] CPU: 0 PID: 1 Comm: swapper Not tainted 6.8.0-06619-ge5e038b7ae9d #1
[    3.291087] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[    3.291087] RIP: 0010:rapl_cpu_online+0xf2/0x110
[    3.291087] Code: 05 ff 8e 07 03 40 42 0f 00 48 89 43 60 e8 56 5f 12 00 8b 15 b4 84 61 02 48 8b 05 01 8f 07 03 48 c7 83 90 00 00 00 e0 84 80 b6 <48> 89 9c d0 38 01 00 00 e9 2b ff ff ff b8 f4 ff ff ff e9 47 ff ff
[    3.291087] RSP: 0018:ffffa3d54001fdd0 EFLAGS: 00000246
[    3.291087] RAX: ffff9cd001f3f200 RBX: ffff9cd001fb34a8 RCX: 0000000000000000
[    3.291087] RDX: 00000000ffffffed RSI: 0000000000000001 RDI: ffff9cd001fb3550
[    3.291087] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000001
[    3.291087] R10: 0000000000000001 R11: 0000000000018001 R12: 0000000000000000
[    3.291087] R13: 000000000000009e R14: ffffffffb6808180 R15: ffffffffb86710e5
[    3.291087] FS:  0000000000000000(0000) GS:ffffffffb8ab0000(0000) knlGS:0000000000000000
[    3.291087] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.291087] CR2: ffff9cd801f3f2a0 CR3: 000000005e6a2000 CR4: 00000000001506f0
[    3.291087] Call Trace:
[    3.291087]  <TASK>
[    3.291087]  ? __die+0x1f/0x60
[    3.291087]  ? page_fault_oops+0x148/0x460
[    3.291087]  ? search_exception_tables+0x37/0x50
[    3.291087]  ? fixup_exception+0x21/0x320
[    3.291087]  ? exc_page_fault+0xca/0x150
[    3.291087]  ? asm_exc_page_fault+0x26/0x30
[    3.291087]  ? __pfx_rapl_cpu_online+0x10/0x10
[    3.291087]  ? rapl_cpu_online+0xf2/0x110
[    3.291087]  cpuhp_invoke_callback.constprop.0+0x117/0x3e0
[    3.291087]  __cpuhp_setup_state_cpuslocked+0x1b7/0x280
[    3.291087]  ? __pfx_rapl_cpu_online+0x10/0x10
[    3.291087]  rapl_pmu_init+0x189/0x2e0
[    3.291087]  ? __pfx_rapl_pmu_init+0x10/0x10
[    3.291087]  do_one_initcall+0x4f/0x210
[    3.291087]  kernel_init_freeable+0x166/0x290
[    3.291087]  ? __pfx_kernel_init+0x10/0x10
[    3.291087]  kernel_init+0x15/0x1b0
[    3.291087]  ret_from_fork+0x2f/0x50
[    3.291087]  ? __pfx_kernel_init+0x10/0x10
[    3.291087]  ret_from_fork_asm+0x19/0x30
[    3.291087]  </TASK>
[    3.291087] Modules linked in:
[    3.291087] CR2: ffff9cd801f3f2a0
[    3.291087] ---[ end trace 0000000000000000 ]---
[    3.291087] RIP: 0010:rapl_cpu_online+0xf2/0x110
[    3.291087] Code: 05 ff 8e 07 03 40 42 0f 00 48 89 43 60 e8 56 5f 12 00 8b 15 b4 84 61 02 48 8b 05 01 8f 07 03 48 c7 83 90 00 00 00 e0 84 80 b6 <48> 89 9c d0 38 01 00 00 e9 2b ff ff ff b8 f4 ff ff ff e9 47 ff ff
[    3.291087] RSP: 0018:ffffa3d54001fdd0 EFLAGS: 00000246
[    3.291087] RAX: ffff9cd001f3f200 RBX: ffff9cd001fb34a8 RCX: 0000000000000000
[    3.291087] RDX: 00000000ffffffed RSI: 0000000000000001 RDI: ffff9cd001fb3550
[    3.291087] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000001
[    3.291087] R10: 0000000000000001 R11: 0000000000018001 R12: 0000000000000000
[    3.291087] R13: 000000000000009e R14: ffffffffb6808180 R15: ffffffffb86710e5
[    3.291087] FS:  0000000000000000(0000) GS:ffffffffb8ab0000(0000) knlGS:0000000000000000
[    3.291087] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.291087] CR2: ffff9cd801f3f2a0 CR3: 000000005e6a2000 CR4: 00000000001506f0
[    3.291087] note: swapper[1] exited with irqs disabled
[    3.306047] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009

---
1st bisect:

# bad: [1bbeaf83dd7b5e3628b98bec66ff8fe2646e14aa] Merge tag 'perf-tools-for-v6.9-2024-03-13' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
# good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
git bisect start 'HEAD' 'v6.8'
# bad: [9187210eee7d87eea37b45ea93454a88681894a4] Merge tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
git bisect bad 9187210eee7d87eea37b45ea93454a88681894a4
# bad: [a01c9fe32378636ae65bec8047b5de3fdb2ba5c8] Merge tag 'nfsd-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
git bisect bad a01c9fe32378636ae65bec8047b5de3fdb2ba5c8
# bad: [691632f0e86973604678d193ccfa47b9614581aa] Merge tag 's390-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
git bisect bad 691632f0e86973604678d193ccfa47b9614581aa
# good: [8ede842f669b6f78812349bbef4d1efd0fbdafce] Merge tag 'rust-6.9' of https://github.com/Rust-for-Linux/linux
git bisect good 8ede842f669b6f78812349bbef4d1efd0fbdafce
# good: [bfdb395a7cde12d83a623949ed029b0ab38d765b] Merge tag 'x86_mtrr_for_v6.9_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good bfdb395a7cde12d83a623949ed029b0ab38d765b
# bad: [685d98211273f60e38a6d361b62d7016c545297e] Merge tag 'x86-core-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 685d98211273f60e38a6d361b62d7016c545297e
# good: [b0402403e54ae9eb94ce1cbb53c7def776e97426] Merge tag 'edac_updates_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
git bisect good b0402403e54ae9eb94ce1cbb53c7def776e97426
# good: [cb81deefb59de01325ab822f900c13941bfaf67f] x86/idle: Sanitize X86_BUG_AMD_E400 handling
git bisect good cb81deefb59de01325ab822f900c13941bfaf67f
# good: [73f0d1d7b4abb4a46bae1a0d8caf66e23d1138d0] Merge tag 'x86-asm-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 73f0d1d7b4abb4a46bae1a0d8caf66e23d1138d0
# good: [65efc4dc12c5cc296374278673b89390eba79fe6] x86/cpu: Provide a declaration for itlb_multihit_kvm_mitigation
git bisect good 65efc4dc12c5cc296374278673b89390eba79fe6
# good: [d69ad12c786f0a4593c48c0658043aa4a5116b09] Merge tag 'x86-build-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good d69ad12c786f0a4593c48c0658043aa4a5116b09
# good: [35ce64922c8263448e58a2b9e8d15a64e11e9b2d] x86/idle: Select idle routine only once
git bisect good 35ce64922c8263448e58a2b9e8d15a64e11e9b2d
# good: [774a86f1c885460ade4334b901919fa1d8ae6ec6] x86/nmi: Drop unused declaration of proc_nmi_enabled()
git bisect good 774a86f1c885460ade4334b901919fa1d8ae6ec6
# bad: [fcc196579aa1fc167d6778948bff69fae6116737] Merge tag 'x86-cleanups-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad fcc196579aa1fc167d6778948bff69fae6116737
# first bad commit: [fcc196579aa1fc167d6778948bff69fae6116737] Merge tag 'x86-cleanups-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

---
2nd bisect:

# bad: [6d70929c7019e265425f7a89cf72163a639d462e] x86/nmi: Drop unused declaration of proc_nmi_enabled()
# good: [d69ad12c786f0a4593c48c0658043aa4a5116b09] Merge tag 'x86-build-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect start 'HEAD' 'fcc196579aa1fc167d6778948bff69fae6116737~1'
# good: [5c157d25918ef15454c2a9a9262f7b763d9c9add] x86/msr: Add missing __percpu annotations
git bisect good 5c157d25918ef15454c2a9a9262f7b763d9c9add
# bad: [f5a6b1e2d92d825a0f73ca11e960795da336d99e] x86/uaccess: Add missing __force to casts in __access_ok() and valid_user_address()
git bisect bad f5a6b1e2d92d825a0f73ca11e960795da336d99e
# bad: [68907233f6d26a214bb79f892db8d999b15dda97] x86/percpu: Cure per CPU madness on UP
git bisect bad 68907233f6d26a214bb79f892db8d999b15dda97
# good: [053df18ceb928c8631042317a884b2842a457f1b] smp: Consolidate smp_prepare_boot_cpu()
git bisect good 053df18ceb928c8631042317a884b2842a457f1b
# first bad commit: [68907233f6d26a214bb79f892db8d999b15dda97] x86/percpu: Cure per CPU madness on UP

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-15 16:17   ` [patch 5/9] x86: " Guenter Roeck
@ 2024-03-15 16:42     ` Linus Torvalds
  2024-03-15 17:02       ` Guenter Roeck
  2024-03-15 17:40       ` Thomas Gleixner
  2024-03-20  8:58     ` Thomas Gleixner
  1 sibling, 2 replies; 41+ messages in thread
From: Linus Torvalds @ 2024-03-15 16:42 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Thomas Gleixner, LKML, x86, Uros Bizjak, linux-sparse, lkp,
	oe-kbuild-all

On Fri, 15 Mar 2024 at 09:17, Guenter Roeck <linux@roeck-us.net> wrote:
>
> [    3.291087] RIP: 0010:rapl_cpu_online+0xf2/0x110
> [    3.291087] Code: 05 ff 8e 07 03 40 42 0f 00 48 89 43 60 e8 56 5f 12 00 8b 15 b4 84 61 02 48 8b 05 01 8f 07 03 48 c7 83 90 00 00 00 e0 84 80 b6 <48> 89 9c d0 38 01 00 00 e9 2b ff ff ff b8 f4 ff ff ff e9 47 ff ff

The code is

  mov    %rax,0x60(%rbx)
  call   0x125f5f
  mov    0x26184b4(%rip),%edx
  mov    0x3078f01(%rip),%rax
  movq   $0xffffffffb68084e0,0x90(%rbx)
  mov    %rbx,0x138(%rax,%rdx,8)                <-- trapping instruction
  jmp    <backwards>

with %rdx being some index having the value 0xffffffed (-19).

That's ENODEV.

Without line numbers (if you have debug info for that kernel, it's
good to run "scripts/decode_stacktrace.sh" on stack traces) it's hard
to really know what's up, but I strongly suspect that it's this:

        rapl_pmus->pmus[topology_logical_die_id(cpu)] = pmu;

because we have

   topology_logical_die_id(cpu) ->
       (cpu_data(cpu).topo.logical_die_id)

and we have

    c->topo.logical_die_id = topology_get_logical_id(apicid, TOPO_DIE_DOMAIN);

and topology_get_logical_id() does this:

        if (lvlid >= MAX_LOCAL_APIC)
                return -ERANGE;
        if (!test_bit(lvlid, apic_maps[at_level].map))
                return -ENODEV;

so that -ENODEV is not entirely unlikely for a UP run.

This also explains why it *used* to work - that whole thing is new to
the current merge window and came in through commit ca7e91776912
("Merge tag 'x86-apic-2024-03-10' of ...").

Thomas, over to you. I wonder if maybe all those topology macros
should just return 0 on an UP build, but that
topology_get_logical_id() thing looks a bit wrong regardless.

It really shouldn't depend on local apic data for configs that may not
*have* a local apic.

                 Linus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-15 16:42     ` Linus Torvalds
@ 2024-03-15 17:02       ` Guenter Roeck
  2024-03-15 17:40       ` Thomas Gleixner
  1 sibling, 0 replies; 41+ messages in thread
From: Guenter Roeck @ 2024-03-15 17:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, LKML, x86, Uros Bizjak, linux-sparse, lkp,
	oe-kbuild-all

On 3/15/24 09:42, Linus Torvalds wrote:
> On Fri, 15 Mar 2024 at 09:17, Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> [    3.291087] RIP: 0010:rapl_cpu_online+0xf2/0x110
>> [    3.291087] Code: 05 ff 8e 07 03 40 42 0f 00 48 89 43 60 e8 56 5f 12 00 8b 15 b4 84 61 02 48 8b 05 01 8f 07 03 48 c7 83 90 00 00 00 e0 84 80 b6 <48> 89 9c d0 38 01 00 00 e9 2b ff ff ff b8 f4 ff ff ff e9 47 ff ff
> 
> The code is
> 
>    mov    %rax,0x60(%rbx)
>    call   0x125f5f
>    mov    0x26184b4(%rip),%edx
>    mov    0x3078f01(%rip),%rax
>    movq   $0xffffffffb68084e0,0x90(%rbx)
>    mov    %rbx,0x138(%rax,%rdx,8)                <-- trapping instruction
>    jmp    <backwards>
> 
> with %rdx being some index having the value 0xffffffed (-19).
> 
> That's ENODEV.
> 
> Without line numbers (if you have debug info for that kernel, it's
> good to run "scripts/decode_stacktrace.sh" on stack traces) it's hard

Sorry, I should have done that.

> to really know what's up, but I strongly suspect that it's this:
> 
>          rapl_pmus->pmus[topology_logical_die_id(cpu)] = pmu;

Correct:

[    2.632164] RIP: 0010:rapl_cpu_online (arch/x86/events/rapl.c:581)

which does point to that line. Here is a complete decoded backtrace:

[    2.632164] Call Trace:
[    2.632164]  <TASK>
[    2.632164] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
[    2.632164] ? page_fault_oops (arch/x86/mm/fault.c:713)
[    2.632164] ? search_exception_tables (kernel/extable.c:59)
[    2.632164] ? fixup_exception (arch/x86/mm/extable.c:328)
[    2.632164] ? exc_page_fault (arch/x86/mm/fault.c:1503 arch/x86/mm/fault.c:1563)
[    2.632164] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
[    2.632164] ? __pfx_rapl_cpu_online (arch/x86/events/rapl.c:566)
[    2.632164] ? rapl_cpu_online (arch/x86/events/rapl.c:581)
[    2.632164] cpuhp_invoke_callback.constprop.0 (kernel/cpu.c:195)
[    2.632164] __cpuhp_setup_state_cpuslocked (kernel/cpu.c:2541)
[    2.632164] ? __pfx_rapl_cpu_online (arch/x86/events/rapl.c:566)
[    2.632164] rapl_pmu_init (./include/linux/cpuhotplug.h:274 arch/x86/events/rapl.c:843)
[    2.632164] ? __pfx_rapl_pmu_init (arch/x86/events/rapl.c:816)
[    2.632164] do_one_initcall (init/main.c:1241)
[    2.632164] kernel_init_freeable (init/main.c:1302 init/main.c:1319 init/main.c:1338 init/main.c:1551)
[    2.632164] ? __pfx_kernel_init (init/main.c:1432)
[    2.632164] kernel_init (init/main.c:1442)
[    2.632164] ret_from_fork (arch/x86/kernel/process.c:153)
[    2.632164] ? __pfx_kernel_init (init/main.c:1432)
[    2.632164] ret_from_fork_asm (arch/x86/entry/entry_64.S:256)

Guenter


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-15 16:42     ` Linus Torvalds
  2024-03-15 17:02       ` Guenter Roeck
@ 2024-03-15 17:40       ` Thomas Gleixner
  2024-03-15 22:55         ` Thomas Gleixner
  1 sibling, 1 reply; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-15 17:40 UTC (permalink / raw)
  To: Linus Torvalds, Guenter Roeck
  Cc: LKML, x86, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

On Fri, Mar 15 2024 at 09:42, Linus Torvalds wrote:
> On Fri, 15 Mar 2024 at 09:17, Guenter Roeck <linux@roeck-us.net> wrote:
> Without line numbers (if you have debug info for that kernel, it's
> good to run "scripts/decode_stacktrace.sh" on stack traces) it's hard
> to really know what's up, but I strongly suspect that it's this:
>
>         rapl_pmus->pmus[topology_logical_die_id(cpu)] = pmu;
>
> because we have
>
>    topology_logical_die_id(cpu) ->
>        (cpu_data(cpu).topo.logical_die_id)
>
> and we have
>
>     c->topo.logical_die_id = topology_get_logical_id(apicid, TOPO_DIE_DOMAIN);
>
> and topology_get_logical_id() does this:
>
>         if (lvlid >= MAX_LOCAL_APIC)
>                 return -ERANGE;
>         if (!test_bit(lvlid, apic_maps[at_level].map))
>                 return -ENODEV;
>
> so that -ENODEV is not entirely unlikely for a UP run.
>
> This also explains why it *used* to work - that whole thing is new to
> the current merge window and came in through commit ca7e91776912
> ("Merge tag 'x86-apic-2024-03-10' of ...").
>
> Thomas, over to you. I wonder if maybe all those topology macros
> should just return 0 on an UP build, but that
> topology_get_logical_id() thing looks a bit wrong regardless.
>
> It really shouldn't depend on local apic data for configs that may not
> *have* a local apic.

Right. Let me look.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-15 17:40       ` Thomas Gleixner
@ 2024-03-15 22:55         ` Thomas Gleixner
  2024-03-15 23:23           ` Linus Torvalds
  2024-03-16  0:56           ` Guenter Roeck
  0 siblings, 2 replies; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-15 22:55 UTC (permalink / raw)
  To: Linus Torvalds, Guenter Roeck
  Cc: LKML, x86, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

On Fri, Mar 15 2024 at 18:40, Thomas Gleixner wrote:
> On Fri, Mar 15 2024 at 09:42, Linus Torvalds wrote:
>> On Fri, 15 Mar 2024 at 09:17, Guenter Roeck <linux@roeck-us.net> wrote:
>> Thomas, over to you. I wonder if maybe all those topology macros
>> should just return 0 on an UP build, but that
>> topology_get_logical_id() thing looks a bit wrong regardless.
>>
>> It really shouldn't depend on local apic data for configs that may not
>> *have* a local apic.
>
> Right. Let me look.

Not really. The problem is that a SMP build can run on a UP machine w/o
APIC or command line disables the APIC and will run into the exactly
same problem. The only case where we know that it is impossible is when
APIC support is disabled, which is silly but topic for a different
discussion.

So the proper thing to do is to check for num_possible_cpus() == 1 in
that function.

Sure you can argue that we could avoid it for SMP=n builds completely,
but I think the right thing to do is to aim for removing CONFIG_SMP and
make the UP build a subset of a generic SMP capable build which has
CONFIG_NR_CPUS=1, i.e. num_possible_cpus() = 1. Why?

Because it consolidates the code and makes UP use exactly the same
mechanisms as SMP which pretty much avoids the problem we see today that
UP lacks test coverage and becomes more esoteric and untested over time.

The downside is a slightly larger kernel image for such a build.

Though if we pretend that we seriously care about that 10% larger memory
footprint or about the marginal performance benefit of SMP=n on dead
hardware, then we are just taking the wrong pills.

The point is that this very commit in question was heading deliberately
into the direction of removing the by now silly differences of UP/SMP
for correctness sake. It just happened to unearth the missing check in
topology_get_logical_id(), but that check is required independent of
SMP=y/n as I pointed out above.

Thanks,

        tglx

---
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 3259b1d4fefe..118d9f7792ee 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -279,6 +279,15 @@ int topology_get_logical_id(u32 apicid, enum x86_topology_domains at_level)
 
 	if (lvlid >= MAX_LOCAL_APIC)
 		return -ERANGE;
+	/*
+	 * Spare the exercise on UP as there is only one instance at any
+	 * level and the map check below might fail because the CPU does
+	 * not have a local APIC or local APIC has been disabled on the
+	 * kernel command line.
+	 */
+	if (num_possible_cpus() == 1)
+		return 0;
+
 	if (!test_bit(lvlid, apic_maps[at_level].map))
 		return -ENODEV;
 	/* Get the number of set bits before @lvlid. */

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-15 22:55         ` Thomas Gleixner
@ 2024-03-15 23:23           ` Linus Torvalds
  2024-03-16  1:11             ` Thomas Gleixner
  2024-03-16  0:56           ` Guenter Roeck
  1 sibling, 1 reply; 41+ messages in thread
From: Linus Torvalds @ 2024-03-15 23:23 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Guenter Roeck, LKML, x86, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

On Fri, 15 Mar 2024 at 15:55, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Not really. The problem is that a SMP build can run on a UP machine w/o
> APIC or command line disables the APIC and will run into the exactly
> same problem. The only case where we know that it is impossible is when
> APIC support is disabled, which is silly but topic for a different
> discussion.

Oh, I agree - that was why I said that it shouldn't depend on a local
APIC on machines that may not even have one.

That "may not even have one" can still be a static option - we
technically allow 32-bit UP kernel to not enable X86_UP_APIC, although
it might be time to drop that option.

> So the proper thing to do is to check for num_possible_cpus() == 1 in
> that function.

I think that's _one_ proper thing. I still think that the deeper
problem is that it still looks at local apic rules even when those
rules are completely nonsensical.

For example, that MAX_LOCAL_APIC range test may not matter simply
because it's testing a constant value, but it still smells entirely
wrong to even check for that, when the system doesn't necessarily have
one.

So I think your patch may fix the immediate bug, but I think it's
still just a band-aid.

Either we should just make all machines look like they have the proper
local apic mappings, or we shouldn't look at any local apic rules AT
ALL.

So I'd rather see those apic_maps[] just be properly filled in.

> Sure you can argue that we could avoid it for SMP=n builds completely,
> but I think the right thing to do is to aim for removing CONFIG_SMP and
> make the UP build a subset of a generic SMP capable build which has
> CONFIG_NR_CPUS=1, i.e. num_possible_cpus() = 1. Why?

I wouldn't be entirely opposed to just doing that. UP has become
fairly irrelevant.

That said, UP is *not* entirely irrelevant on other architectures, and
if we drop UP support on x86, we'll be effectively dropping a lot of
coverage testing. The number of people who do cross-compilers is
pretty small.

End result: I'd *much* rather get rid of X86_UP_APIC and the "nolapic"
kernel command line, and say "even UP has to have a local APIC".

We already require a Pentium-class CPU, so in practice we already
require that local APIC setup. And yes, machines existed where it
could be turned off, but I don't think that is relevant any more.

Put another way: I think "UP config for wider build testing" is a
_lot_ more relevant than "no LAPIC support".

             Linus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-15 22:55         ` Thomas Gleixner
  2024-03-15 23:23           ` Linus Torvalds
@ 2024-03-16  0:56           ` Guenter Roeck
  1 sibling, 0 replies; 41+ messages in thread
From: Guenter Roeck @ 2024-03-16  0:56 UTC (permalink / raw)
  To: Thomas Gleixner, Linus Torvalds
  Cc: LKML, x86, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

On 3/15/24 15:55, Thomas Gleixner wrote:
> On Fri, Mar 15 2024 at 18:40, Thomas Gleixner wrote:
>> On Fri, Mar 15 2024 at 09:42, Linus Torvalds wrote:
>>> On Fri, 15 Mar 2024 at 09:17, Guenter Roeck <linux@roeck-us.net> wrote:
>>> Thomas, over to you. I wonder if maybe all those topology macros
>>> should just return 0 on an UP build, but that
>>> topology_get_logical_id() thing looks a bit wrong regardless.
>>>
>>> It really shouldn't depend on local apic data for configs that may not
>>> *have* a local apic.
>>
>> Right. Let me look.
> 
> Not really. The problem is that a SMP build can run on a UP machine w/o
> APIC or command line disables the APIC and will run into the exactly
> same problem. The only case where we know that it is impossible is when
> APIC support is disabled, which is silly but topic for a different
> discussion.
> 
> So the proper thing to do is to check for num_possible_cpus() == 1 in
> that function.
> 
> Sure you can argue that we could avoid it for SMP=n builds completely,
> but I think the right thing to do is to aim for removing CONFIG_SMP and
> make the UP build a subset of a generic SMP capable build which has
> CONFIG_NR_CPUS=1, i.e. num_possible_cpus() = 1. Why?
> 
> Because it consolidates the code and makes UP use exactly the same
> mechanisms as SMP which pretty much avoids the problem we see today that
> UP lacks test coverage and becomes more esoteric and untested over time.
> 
> The downside is a slightly larger kernel image for such a build.
> 
> Though if we pretend that we seriously care about that 10% larger memory
> footprint or about the marginal performance benefit of SMP=n on dead
> hardware, then we are just taking the wrong pills.
> 

FWIW, I would very much prefer for SMP=n builds to go away for x86.
I don't think anyone uses that in the real world nowadays, and I never
know if I should report problems like this one or just stop testing it.

> The point is that this very commit in question was heading deliberately
> into the direction of removing the by now silly differences of UP/SMP
> for correctness sake. It just happened to unearth the missing check in
> topology_get_logical_id(), but that check is required independent of
> SMP=y/n as I pointed out above.
> 
> Thanks,
> 
>          tglx
> 
> ---
> diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
> index 3259b1d4fefe..118d9f7792ee 100644
> --- a/arch/x86/kernel/cpu/topology.c
> +++ b/arch/x86/kernel/cpu/topology.c
> @@ -279,6 +279,15 @@ int topology_get_logical_id(u32 apicid, enum x86_topology_domains at_level)
>   
>   	if (lvlid >= MAX_LOCAL_APIC)
>   		return -ERANGE;
> +	/*
> +	 * Spare the exercise on UP as there is only one instance at any
> +	 * level and the map check below might fail because the CPU does
> +	 * not have a local APIC or local APIC has been disabled on the
> +	 * kernel command line.
> +	 */
> +	if (num_possible_cpus() == 1)
> +		return 0;
> +

That works.

Tested-by: Guenter Roeck <linux@roeck-us.net>

>   	if (!test_bit(lvlid, apic_maps[at_level].map))
>   		return -ENODEV;
>   	/* Get the number of set bits before @lvlid. */


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-15 23:23           ` Linus Torvalds
@ 2024-03-16  1:11             ` Thomas Gleixner
  2024-03-16  1:23               ` Linus Torvalds
                                 ` (3 more replies)
  0 siblings, 4 replies; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-16  1:11 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, LKML, x86, Uros Bizjak, linux-sparse, lkp,
	oe-kbuild-all, Arnd Bergmann

On Fri, Mar 15 2024 at 16:23, Linus Torvalds wrote:
> On Fri, 15 Mar 2024 at 15:55, Thomas Gleixner <tglx@linutronix.de> wrote:
>> So the proper thing to do is to check for num_possible_cpus() == 1 in
>> that function.
>
> I think that's _one_ proper thing. I still think that the deeper
> problem is that it still looks at local apic rules even when those
> rules are completely nonsensical.
>
> For example, that MAX_LOCAL_APIC range test may not matter simply
> because it's testing a constant value, but it still smells entirely
> wrong to even check for that, when the system doesn't necessarily have
> one.

cpu_info.apic_id defaults to 0, so unless the calling code is completely
broken it will be correct. And I rather catch the case of calling code
being broken in the !APIC case if we still want to support systems
without a local APIC.

> So I think your patch may fix the immediate bug, but I think it's
> still just a band-aid.
>
> Either we should just make all machines look like they have the proper
> local apic mappings, or we shouldn't look at any local apic rules AT
> ALL.

Sure. I can simply check if there was an APIC registered instead.

>> Sure you can argue that we could avoid it for SMP=n builds completely,
>> but I think the right thing to do is to aim for removing CONFIG_SMP and
>> make the UP build a subset of a generic SMP capable build which has
>> CONFIG_NR_CPUS=1, i.e. num_possible_cpus() = 1. Why?
>
> I wouldn't be entirely opposed to just doing that. UP has become
> fairly irrelevant.
>
> That said, UP is *not* entirely irrelevant on other architectures, and
> if we drop UP support on x86, we'll be effectively dropping a lot of
> coverage testing. The number of people who do cross-compilers is
> pretty small.
>
> End result: I'd *much* rather get rid of X86_UP_APIC and the "nolapic"
> kernel command line, and say "even UP has to have a local APIC".
>
> We already require a Pentium-class CPU, so in practice we already
> require that local APIC setup. And yes, machines existed where it
> could be turned off, but I don't think that is relevant any more.

You wish. We still support 486 and some of the still produced 486 clones
do not have a local APIC.

Not that I care and yes I'm all for getting rid of CONFIG_.*_APIC and of
the related config/command line options. If we refuse to boot on
hardware which does not enumerate an APIC then even better.

But that is only a part of the overall problem.

> Put another way: I think "UP config for wider build testing" is a
> _lot_ more relevant than "no LAPIC support".

I really have to disagree here.

The concept of making UP a proper subset of SMP has absolutely nothing
to do with x86 and UP test coverage.

We want SMP as a general concept and overhaul the whole kernel to get
rid of this ever increasing non-sensical UP burden. The real world UP
small system use cases have moved over to other OSes like Zephyr & Co
long ago.

Just because some esoteric architectures (m68k comes to my mind) will
have serious issues with that for the very wrong reasons does not mean
that we should not go there.

It's going to be quite some effort, but the overall benefit is worth it.

OTOH, it's absolutely not rocket science to pretend to be SMP capable
and if some architectures fail to accomodate on the way then we just
should remove them as that's a clear sign of being unmaintained and
irrelevant.

The amount of untested SMP=n code in the kernel becomes frigthening and
your argument that build coverage is making a difference is wishful
thinking at best.

Anything else than making the kernel SMP capable and making UP builds a
well defined subset via CONFIG_NR_CPUS=1 is a complete waste of time and
effort.

If your intention is to indulge in the historical glory of Linux running
on any (by now) irrelevant hardware on the planet, then I stop arguing
right here.

If not, can we please have a serious discussion about going SMP only and
making UP the simple and obvious NR_CPUS=1 subset?

The amount of subtle SMP=n fallout has been kinda exponentially
increasing over the years and it's just putting burden on the wrong
people. TBH, I'm tired of this nonsense.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-16  1:11             ` Thomas Gleixner
@ 2024-03-16  1:23               ` Linus Torvalds
  2024-03-16 21:34                 ` Arnd Bergmann
  2024-03-17 21:03               ` David Laight
                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 41+ messages in thread
From: Linus Torvalds @ 2024-03-16  1:23 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Guenter Roeck, LKML, x86, Uros Bizjak, linux-sparse, lkp,
	oe-kbuild-all, Arnd Bergmann

On Fri, 15 Mar 2024 at 18:11, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> You wish. We still support 486 and some of the still produced 486 clones
> do not have a local APIC.

Ouch. I was _sure_ we had dropped i486 support too due to cmpxchg8b.

But apparently that was just a discussion, and my wishful thinking,
and we never actually followed through.

         Linus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-16  1:23               ` Linus Torvalds
@ 2024-03-16 21:34                 ` Arnd Bergmann
  0 siblings, 0 replies; 41+ messages in thread
From: Arnd Bergmann @ 2024-03-16 21:34 UTC (permalink / raw)
  To: Linus Torvalds, Thomas Gleixner
  Cc: Guenter Roeck, LKML, x86, Uros Bizjak, linux-sparse,
	kernel test robot, oe-kbuild-all

On Sat, Mar 16, 2024, at 02:23, Linus Torvalds wrote:
> On Fri, 15 Mar 2024 at 18:11, Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> You wish. We still support 486 and some of the still produced 486 clones
>> do not have a local APIC.
>
> Ouch. I was _sure_ we had dropped i486 support too due to cmpxchg8b.
>
> But apparently that was just a discussion, and my wishful thinking,
> and we never actually followed through.

Maciej Rozycki still cares about i486 type hardware, and he was
asking for it to be kept around in the thread following [1]

I think the best suggestion at the time was to make cmpxchg8b
a compile-time feature and I had expected Maciej to follow up with
a patch for that, but this never happend, and nobody sent a patch
to remove support 486 and the early 586 clones either.

I saw recently that there are still distros that advertise 486
support on modern kernels: Tiny Core Linux and Damn Small
Linux. Both ship with a 486 SMP kernel but fail to boot
on qemu unless an APIC is enabled (DSL also requires i686 or
higher to run userspace).

       Arnd

[1] https://lore.kernel.org/all/20220815071332.627393-9-yuzhao@google.com/

^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-16  1:11             ` Thomas Gleixner
  2024-03-16  1:23               ` Linus Torvalds
@ 2024-03-17 21:03               ` David Laight
  2024-03-18 11:11               ` Thomas Gleixner
  2024-03-18 17:27               ` Thomas Gleixner
  3 siblings, 0 replies; 41+ messages in thread
From: David Laight @ 2024-03-17 21:03 UTC (permalink / raw)
  To: 'Thomas Gleixner', Linus Torvalds
  Cc: Guenter Roeck, LKML, x86, Uros Bizjak, linux-sparse, lkp,
	oe-kbuild-all, Arnd Bergmann

From: Thomas Gleixner
> Sent: 16 March 2024 01:11
...
> We want SMP as a general concept and overhaul the whole kernel to get
> rid of this ever increasing non-sensical UP burden. The real world UP
> small system use cases have moved over to other OSes like Zephyr & Co
> long ago.
> 
> Just because some esoteric architectures (m68k comes to my mind) will
> have serious issues with that for the very wrong reasons does not mean
> that we should not go there.
> 
> It's going to be quite some effort, but the overall benefit is worth it.
> 
> OTOH, it's absolutely not rocket science to pretend to be SMP capable
> and if some architectures fail to accomodate on the way then we just
> should remove them as that's a clear sign of being unmaintained and
> irrelevant.

There are fpga soft-cpu (eg Nios & Risc-V) that can run linux.
They are definitely memory constrained and really wouldn't want
most of the SMP overhead.

I'm not what it involves apart from simplified startup, compiling
out IPI and spinlocks and optimising per-cpu data.
But you wouldn't want to be running an SMP capable kernel on such systems.
x86 is a different beast - except perhaps 486.

It has to be said that I've never understood why anyone would run
Linux on a Nios-II cpu. Far too slow for anything useful (you might
get 100MHz if you are lucky), caches will be small and external memory
accesses slow.
I doubt soft RISC-V are any better (and I suspect they are worse).
We do have 4 Nios-II in the fpga image for a PCIe card.
They run very small programs (one has 2kB of code memory) to do things
that would be impossible to write (sensibly) in VHDL.
There are fpga with embedded ARM (and probably RISC-V) cores for
running real OS.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-16  1:11             ` Thomas Gleixner
  2024-03-16  1:23               ` Linus Torvalds
  2024-03-17 21:03               ` David Laight
@ 2024-03-18 11:11               ` Thomas Gleixner
  2024-03-18 17:27               ` Thomas Gleixner
  3 siblings, 0 replies; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-18 11:11 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, LKML, x86, Uros Bizjak, linux-sparse, lkp,
	oe-kbuild-all, Arnd Bergmann

On Sat, Mar 16 2024 at 02:11, Thomas Gleixner wrote:
> On Fri, Mar 15 2024 at 16:23, Linus Torvalds wrote:
>> Either we should just make all machines look like they have the proper
>> local apic mappings, or we shouldn't look at any local apic rules AT
>> ALL.
>
> Sure. I can simply check if there was an APIC registered instead.

Like the below. I'm not entirely sure though whether the sanity checks
should return an error code, which is what caused the crash Guenter
observed, but I couldn't come up with something sensible either.

Returning 0 might keep the machine alive, but does it make sense?

Thanks,

        tglx
---
 arch/x86/kernel/cpu/topology.c |   15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -277,10 +277,21 @@ int topology_get_logical_id(u32 apicid,
 	/* Remove the bits below @at_level to get the proper level ID of @apicid */
 	unsigned int lvlid = topo_apicid(apicid, at_level);
 
-	if (lvlid >= MAX_LOCAL_APIC)
+	if (WARN_ON_ONCE(lvlid >= MAX_LOCAL_APIC))
 		return -ERANGE;
-	if (!test_bit(lvlid, apic_maps[at_level].map))
+
+	/*
+	 * If there was no APIC registered, then the map check below would
+	 * fail. With no APIC this is guaranteed to be an UP system and
+	 * therefore all topology levels have only one entry and their
+	 * logical ID is obviously 0.
+	 */
+	if (topo_info.boot_cpu_apic_id == BAD_APICID)
+		return 0;
+
+	if (WARN_ON_ONCE(!test_bit(lvlid, apic_maps[at_level].map)))
 		return -ENODEV;
+
 	/* Get the number of set bits before @lvlid. */
 	return bitmap_weight(apic_maps[at_level].map, lvlid);
 }

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-16  1:11             ` Thomas Gleixner
                                 ` (2 preceding siblings ...)
  2024-03-18 11:11               ` Thomas Gleixner
@ 2024-03-18 17:27               ` Thomas Gleixner
  2024-03-18 19:13                 ` Arnd Bergmann
  3 siblings, 1 reply; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-18 17:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, LKML, x86, Uros Bizjak, linux-sparse, lkp,
	oe-kbuild-all, Arnd Bergmann

On Sat, Mar 16 2024 at 02:11, Thomas Gleixner wrote:
> On Fri, Mar 15 2024 at 16:23, Linus Torvalds wrote:
> The amount of subtle SMP=n fallout has been kinda exponentially
> increasing over the years and it's just putting burden on the wrong
> people. TBH, I'm tired of this nonsense.

And for the fun of it I hacked Kconfig to allow a SMP=y NR_CPUS=1 build
and checked the size of vmlinux:

                64-bit          32-bit
SMP, NCPUS=1    38438400        22110177
UP              38393703        21682041
Delta              44697          428076
                     0.1%              2%              

The UP savings are not really impressive...

Let me look what it actually takes to do that.

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-18 17:27               ` Thomas Gleixner
@ 2024-03-18 19:13                 ` Arnd Bergmann
  2024-03-19 16:21                   ` Thomas Gleixner
  0 siblings, 1 reply; 41+ messages in thread
From: Arnd Bergmann @ 2024-03-18 19:13 UTC (permalink / raw)
  To: Thomas Gleixner, Linus Torvalds
  Cc: Guenter Roeck, LKML, x86, Uros Bizjak, linux-sparse,
	kernel test robot, oe-kbuild-all

On Mon, Mar 18, 2024, at 18:27, Thomas Gleixner wrote:
> On Sat, Mar 16 2024 at 02:11, Thomas Gleixner wrote:
>> On Fri, Mar 15 2024 at 16:23, Linus Torvalds wrote:
>> The amount of subtle SMP=n fallout has been kinda exponentially
>> increasing over the years and it's just putting burden on the wrong
>> people. TBH, I'm tired of this nonsense.
>
> And for the fun of it I hacked Kconfig to allow a SMP=y NR_CPUS=1 build
> and checked the size of vmlinux:
>
>                 64-bit          32-bit
> SMP, NCPUS=1    38438400        22110177
> UP              38393703        21682041
> Delta              44697          428076
>                      0.1%              2%              
>
> The UP savings are not really impressive...
>
> Let me look what it actually takes to do that.

FWIW, I did some experiments a few weeks ago on 32-bit ARM,
using a fairly minimal kernel in a virtual machine, and
checking the runtime memory consumption rather than compile-time.
In a kvm guest with 32MiB RAM, I saw a difference of multiple
megabytes in memory usage:

Linux testvm 6.8.0-rc4-00410-gc02197fc9076-dirty #1 SMP PREEMPT armv7l
root@testvm:~# free
           	total   used    free  	shared  buff/cache   available
Mem:       	26932   14956   1732   	    52       12800   	11976
Swap:      	16360    3632   12728

Linux testvm 6.8.0-rc4-00410-gc02197fc9076-dirty #2 PREEMPT armv7l
root@testvm:~# free
           	total    used  	free  	shared  buff/cache   available
Mem:       	26932   13744  	5648        32       10092   	13188
Swap:      	16360    3880  	12480

There is a little difference between runs, but this does seem
significant enough to keep it. The SMP build was with
CONFIG_NR_CPUS=2 (the smallest supported compile-time number),
but running on a single-CPU qemu instance.

      Arnd

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-18 19:13                 ` Arnd Bergmann
@ 2024-03-19 16:21                   ` Thomas Gleixner
  2024-03-19 18:26                     ` Guenter Roeck
  0 siblings, 1 reply; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-19 16:21 UTC (permalink / raw)
  To: Arnd Bergmann, Linus Torvalds
  Cc: Guenter Roeck, LKML, x86, Uros Bizjak, linux-sparse,
	kernel test robot, oe-kbuild-all

On Mon, Mar 18 2024 at 20:13, Arnd Bergmann wrote:
> FWIW, I did some experiments a few weeks ago on 32-bit ARM,
> using a fairly minimal kernel in a virtual machine, and
> checking the runtime memory consumption rather than compile-time.
> In a kvm guest with 32MiB RAM, I saw a difference of multiple
> megabytes in memory usage:
>
> Linux testvm 6.8.0-rc4-00410-gc02197fc9076-dirty #1 SMP PREEMPT armv7l
> root@testvm:~# free
>            	total   used    free  	shared  buff/cache   available
> Mem:       	26932   14956   1732   	    52       12800   	11976
> Swap:      	16360    3632   12728
>
> Linux testvm 6.8.0-rc4-00410-gc02197fc9076-dirty #2 PREEMPT armv7l
> root@testvm:~# free
>            	total    used  	free  	shared  buff/cache   available
> Mem:       	26932   13744  	5648        32       10092   	13188
> Swap:      	16360    3880  	12480
>
> There is a little difference between runs, but this does seem
> significant enough to keep it. The SMP build was with
> CONFIG_NR_CPUS=2 (the smallest supported compile-time number),
> but running on a single-CPU qemu instance.

With a SMP=y, NR_CPUS=1 build on x86 64bit I get:

               total        used        free      shared  buff/cache   available
Mem:        32882056      498068    32590580        4884      128884    32383988
Swap:         998396           0      998396

Same config just SMP=n:

               total        used        free      shared  buff/cache   available
Mem:        32885804      461704    32635284        4876      119480    32424100
Swap:         998396           0      998396

So the delta for available is ~40 MiB.

But if I look at it with init=/bin/sh on the command line then the delta
is significantly different:

With a SMP=y, NR_CPUS=1 build on x86 64bit I get:

               total        used        free      shared  buff/cache   available
Mem:        32883680      324120    32822728         216       10864    32559560
Swap:              0           0           0

Same config just SMP=n:

               total        used        free      shared  buff/cache   available
Mem:        32885804      326876    32821972         216       11100    32558928
Swap:              0           0           0

Delta available = 632 KiB

I haven't had the time to stare at that in detail, but comparing
/proc/meminfo for the full boot case above does not immediately give me
a hint. It's confusing at best...

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-19 16:21                   ` Thomas Gleixner
@ 2024-03-19 18:26                     ` Guenter Roeck
  0 siblings, 0 replies; 41+ messages in thread
From: Guenter Roeck @ 2024-03-19 18:26 UTC (permalink / raw)
  To: Thomas Gleixner, Arnd Bergmann, Linus Torvalds
  Cc: LKML, x86, Uros Bizjak, linux-sparse, kernel test robot, oe-kbuild-all

On 3/19/24 09:21, Thomas Gleixner wrote:
> On Mon, Mar 18 2024 at 20:13, Arnd Bergmann wrote:
>> FWIW, I did some experiments a few weeks ago on 32-bit ARM,
>> using a fairly minimal kernel in a virtual machine, and
>> checking the runtime memory consumption rather than compile-time.
>> In a kvm guest with 32MiB RAM, I saw a difference of multiple
>> megabytes in memory usage:
>>
>> Linux testvm 6.8.0-rc4-00410-gc02197fc9076-dirty #1 SMP PREEMPT armv7l
>> root@testvm:~# free
>>             	total   used    free  	shared  buff/cache   available
>> Mem:       	26932   14956   1732   	    52       12800   	11976
>> Swap:      	16360    3632   12728
>>
>> Linux testvm 6.8.0-rc4-00410-gc02197fc9076-dirty #2 PREEMPT armv7l
>> root@testvm:~# free
>>             	total    used  	free  	shared  buff/cache   available
>> Mem:       	26932   13744  	5648        32       10092   	13188
>> Swap:      	16360    3880  	12480
>>
>> There is a little difference between runs, but this does seem
>> significant enough to keep it. The SMP build was with
>> CONFIG_NR_CPUS=2 (the smallest supported compile-time number),
>> but running on a single-CPU qemu instance.
> 
> With a SMP=y, NR_CPUS=1 build on x86 64bit I get:
> 
>                 total        used        free      shared  buff/cache   available
> Mem:        32882056      498068    32590580        4884      128884    32383988
> Swap:         998396           0      998396
> 
> Same config just SMP=n:
> 
>                 total        used        free      shared  buff/cache   available
> Mem:        32885804      461704    32635284        4876      119480    32424100
> Swap:         998396           0      998396
> 
> So the delta for available is ~40 MiB.
> 
> But if I look at it with init=/bin/sh on the command line then the delta
> is significantly different:
> 
> With a SMP=y, NR_CPUS=1 build on x86 64bit I get:
> 
>                 total        used        free      shared  buff/cache   available
> Mem:        32883680      324120    32822728         216       10864    32559560
> Swap:              0           0           0
> 
> Same config just SMP=n:
> 
>                 total        used        free      shared  buff/cache   available
> Mem:        32885804      326876    32821972         216       11100    32558928
> Swap:              0           0           0
> 
> Delta available = 632 KiB
> 
> I haven't had the time to stare at that in detail, but comparing
> /proc/meminfo for the full boot case above does not immediately give me
> a hint. It's confusing at best...
>

That makes me wonder if the number is affected by the total memory size.
How about a system with 1GB of memory or less ?

Guenter


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-15 16:17   ` [patch 5/9] x86: " Guenter Roeck
  2024-03-15 16:42     ` Linus Torvalds
@ 2024-03-20  8:58     ` Thomas Gleixner
  2024-03-20 15:46       ` Guenter Roeck
  1 sibling, 1 reply; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-20  8:58 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: LKML, x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

On Fri, Mar 15 2024 at 09:17, Guenter Roeck wrote:
> I don't know the code well enough to determine what is wrong.
> Please let me know what I can do to help debugging the problem.

Could you provide me the config and the qemu command line?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-20  8:58     ` Thomas Gleixner
@ 2024-03-20 15:46       ` Guenter Roeck
  2024-03-21 11:14         ` Thomas Gleixner
  0 siblings, 1 reply; 41+ messages in thread
From: Guenter Roeck @ 2024-03-20 15:46 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

On 3/20/24 01:58, Thomas Gleixner wrote:
> On Fri, Mar 15 2024 at 09:17, Guenter Roeck wrote:
>> I don't know the code well enough to determine what is wrong.
>> Please let me know what I can do to help debugging the problem.
> 
> Could you provide me the config and the qemu command line?
> 

defconfig-CONFIG_SMP and

qemu-system-x86_64 -kernel arch/x86/boot/bzImage -cpu Haswell \
      --append "console=ttyS0" -nographic -monitor none

The cpu doesn't really matter as long as it is an Intel CPU.
A root file system isn't needed since the boot doesn't get that far.

Guenter


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-20 15:46       ` Guenter Roeck
@ 2024-03-21 11:14         ` Thomas Gleixner
  2024-03-21 14:06           ` Guenter Roeck
  0 siblings, 1 reply; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-21 11:14 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: LKML, x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

On Wed, Mar 20 2024 at 08:46, Guenter Roeck wrote:
> On 3/20/24 01:58, Thomas Gleixner wrote:
>> On Fri, Mar 15 2024 at 09:17, Guenter Roeck wrote:
>>> I don't know the code well enough to determine what is wrong.
>>> Please let me know what I can do to help debugging the problem.
>> 
>> Could you provide me the config and the qemu command line?
>> 
>
> defconfig-CONFIG_SMP and
>
> qemu-system-x86_64 -kernel arch/x86/boot/bzImage -cpu Haswell \
>       --append "console=ttyS0" -nographic -monitor none
>
> The cpu doesn't really matter as long as it is an Intel CPU.
> A root file system isn't needed since the boot doesn't get that far.

Now it get's interesting because I can't reproduce it with that setup at
all.

What's weird is that I saw it exactly once on 64-bit in a VM with a UP
config two days ago, but when I started to add instrumentation it never
came back even after backing the instrumentation changes out. I have
seriously no idea what's going on there.

Is it fully reproducible on your side?

If so can you please provide a full dmesg and then apply the patch below
and provide the resulting full dmesg too?

I found two other issues while trying to find a way to reproduce, but
those are completely unrelated to the problem you are observing.

Thanks,

        tglx
---
 arch/x86/kernel/cpu/topology.c |   19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -176,6 +176,8 @@ static __init void topo_register_apic(u3
 {
 	int cpu, dom;
 
+	pr_info("APIC: %x %d\n", apic_id, present);
+
 	if (present) {
 		set_bit(apic_id, phys_cpu_present_map);
 
@@ -277,10 +279,23 @@ int topology_get_logical_id(u32 apicid,
 	/* Remove the bits below @at_level to get the proper level ID of @apicid */
 	unsigned int lvlid = topo_apicid(apicid, at_level);
 
-	if (lvlid >= MAX_LOCAL_APIC)
+	pr_info("APIC logical ID: %x %x %d\n", apicid, lvlid, at_level);
+
+	if (WARN_ON_ONCE(lvlid >= MAX_LOCAL_APIC))
 		return -ERANGE;
-	if (!test_bit(lvlid, apic_maps[at_level].map))
+
+	/*
+	 * If there was no APIC registered, then the map check below would
+	 * fail. With no APIC this is guaranteed to be an UP system and
+	 * therefore all topology levels have only one entry and their
+	 * logical ID is obviously 0.
+	 */
+	if (topo_info.boot_cpu_apic_id == BAD_APICID)
+		return 0;
+
+	if (WARN_ON_ONCE(!test_bit(lvlid, apic_maps[at_level].map)))
 		return -ENODEV;
+
 	/* Get the number of set bits before @lvlid. */
 	return bitmap_weight(apic_maps[at_level].map, lvlid);
 }

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-21 11:14         ` Thomas Gleixner
@ 2024-03-21 14:06           ` Guenter Roeck
  2024-03-21 16:49             ` Thomas Gleixner
  0 siblings, 1 reply; 41+ messages in thread
From: Guenter Roeck @ 2024-03-21 14:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

On 3/21/24 04:14, Thomas Gleixner wrote:
> On Wed, Mar 20 2024 at 08:46, Guenter Roeck wrote:
>> On 3/20/24 01:58, Thomas Gleixner wrote:
>>> On Fri, Mar 15 2024 at 09:17, Guenter Roeck wrote:
>>>> I don't know the code well enough to determine what is wrong.
>>>> Please let me know what I can do to help debugging the problem.
>>>
>>> Could you provide me the config and the qemu command line?
>>>
>>
>> defconfig-CONFIG_SMP and
>>
>> qemu-system-x86_64 -kernel arch/x86/boot/bzImage -cpu Haswell \
>>        --append "console=ttyS0" -nographic -monitor none
>>
>> The cpu doesn't really matter as long as it is an Intel CPU.
>> A root file system isn't needed since the boot doesn't get that far.
> 
> Now it get's interesting because I can't reproduce it with that setup at
> all.
> 
> What's weird is that I saw it exactly once on 64-bit in a VM with a UP
> config two days ago, but when I started to add instrumentation it never
> came back even after backing the instrumentation changes out. I have
> seriously no idea what's going on there.
> 
> Is it fully reproducible on your side?
> 

Yes, always.

> If so can you please provide a full dmesg and then apply the patch below
> and provide the resulting full dmesg too?
> 

You'll find everything at http://server.roeck-us.net/qemu/x86-nosmp/

The crash is gone after applying your patch. The difference is:

+       /*
+        * If there was no APIC registered, then the map check below would
+        * fail. With no APIC this is guaranteed to be an UP system and
+        * therefore all topology levels have only one entry and their
+        * logical ID is obviously 0.
+        */
+       if (topo_info.boot_cpu_apic_id == BAD_APICID) {
+               pr_info("#### topo_info.boot_cpu_apic_id == BAD_APICID\n");
                 ^^^^ I added this
+               return 0;
+       }
+

I see the "#### topo_info.boot_cpu_apic_id == BAD_APICID" message
twice in the log. See patched.log at the page pointed to above.

Hope the helps,
Guenter


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 5/9] x86: Cure per CPU madness on UP
  2024-03-21 14:06           ` Guenter Roeck
@ 2024-03-21 16:49             ` Thomas Gleixner
  0 siblings, 0 replies; 41+ messages in thread
From: Thomas Gleixner @ 2024-03-21 16:49 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: LKML, x86, Linus Torvalds, Uros Bizjak, linux-sparse, lkp, oe-kbuild-all

On Thu, Mar 21 2024 at 07:06, Guenter Roeck wrote:
> On 3/21/24 04:14, Thomas Gleixner wrote:
>> If so can you please provide a full dmesg and then apply the patch below
>> and provide the resulting full dmesg too?
>
> You'll find everything at http://server.roeck-us.net/qemu/x86-nosmp/

Thanks for providing this.

> The crash is gone after applying your patch. The difference is:
>
> +       /*
> +        * If there was no APIC registered, then the map check below would
> +        * fail. With no APIC this is guaranteed to be an UP system and
> +        * therefore all topology levels have only one entry and their
> +        * logical ID is obviously 0.
> +        */
> +       if (topo_info.boot_cpu_apic_id == BAD_APICID) {
> +               pr_info("#### topo_info.boot_cpu_apic_id == BAD_APICID\n");
>                  ^^^^ I added this
> +               return 0;
> +       }
> +
>
> I see the "#### topo_info.boot_cpu_apic_id == BAD_APICID" message
> twice in the log. See patched.log at the page pointed to above.

I can see why this is emitted. That happens on the initial CPUID
evaluation of the boot CPU very early during boot.

[    0.000000] Command line: console=ttyS0
[    0.000000] CPU topo: APIC logical ID: 0 0 6
[    0.000000] CPU topo: #### topo_info.boot_cpu_apic_id == BAD_APICID
[    0.000000] CPU topo: APIC logical ID: 0 0 4
[    0.000000] CPU topo: #### topo_info.boot_cpu_apic_id == BAD_APICID

The later full CPUID evaluation happens after the ACPI enumeration and
way before the affected RAPL driver is initialized:

[    0.088029] CPU topo: APIC logical ID: 0 0 6
[    0.088084] CPU topo: APIC logical ID: 0 0 4

This invocation has the boot APIC registered as your extra print does
not show up.

...

[    0.585850] RAPL PMU: API unit is 2^-32 Joules, 0 fixed counters, 10737418240 ms ovfl timer

So even without that guard (which we need anyway for the non APIC case)
topology_logical_die_id() == cpu_data(cpu).topo.logical_die_id must have
the correct value in that RAPL initialization and CPU hotplug callback
code.

But our absolutely convoluted startup logic prevents that because:

    identify_cpu_early()      operates on boot_cpu_data
    smp_prepare_boot_cpu()    copies boot_cpu_data to per CPU cpu data
    identify_boot_cpu()       operates on boot_cpu_data

identify_boot_cpu() is the one which gets the correct logical die info,
but that never gets copied over to the per CPU data instance on which
the RAPL code and everything else works on.

I'll cook up a patch later.

Thanks for providing the info!

       tglx


^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2024-03-21 16:49 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-04 10:12 [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu) Thomas Gleixner
2024-03-04 10:12 ` [patch 1/9] perf/x86/amd/uncore: Fix __percpu annotation Thomas Gleixner
2024-03-04 12:15   ` [tip: x86/cleanups] " tip-bot2 for Thomas Gleixner
2024-03-04 10:12 ` [patch 2/9] x86/msr: Prepare for including percpu.h Thomas Gleixner
2024-03-04 12:15   ` [tip: x86/cleanups] x86/msr: Prepare for including <linux/percpu.h> into <asm/msr.h> tip-bot2 for Thomas Gleixner
2024-03-04 10:12 ` [patch 3/9] x86/msr: Add missing __percpu annotations Thomas Gleixner
2024-03-04 12:15   ` [tip: x86/cleanups] " tip-bot2 for Thomas Gleixner
2024-03-04 10:12 ` [patch 4/9] smp: Consolidate smp_prepare_boot_cpu() Thomas Gleixner
2024-03-04 12:15   ` [tip: x86/cleanups] " tip-bot2 for Thomas Gleixner
2024-03-04 10:12 ` [patch 5/9] x86: Cure per CPU madness on UP Thomas Gleixner
2024-03-04 12:15   ` [tip: x86/cleanups] x86/percpu: " tip-bot2 for Thomas Gleixner
2024-03-15 16:17   ` [patch 5/9] x86: " Guenter Roeck
2024-03-15 16:42     ` Linus Torvalds
2024-03-15 17:02       ` Guenter Roeck
2024-03-15 17:40       ` Thomas Gleixner
2024-03-15 22:55         ` Thomas Gleixner
2024-03-15 23:23           ` Linus Torvalds
2024-03-16  1:11             ` Thomas Gleixner
2024-03-16  1:23               ` Linus Torvalds
2024-03-16 21:34                 ` Arnd Bergmann
2024-03-17 21:03               ` David Laight
2024-03-18 11:11               ` Thomas Gleixner
2024-03-18 17:27               ` Thomas Gleixner
2024-03-18 19:13                 ` Arnd Bergmann
2024-03-19 16:21                   ` Thomas Gleixner
2024-03-19 18:26                     ` Guenter Roeck
2024-03-16  0:56           ` Guenter Roeck
2024-03-20  8:58     ` Thomas Gleixner
2024-03-20 15:46       ` Guenter Roeck
2024-03-21 11:14         ` Thomas Gleixner
2024-03-21 14:06           ` Guenter Roeck
2024-03-21 16:49             ` Thomas Gleixner
2024-03-04 10:12 ` [patch 6/9] x86/uaccess: Add missing __force to casts in __access_ok() and valid_user_address() Thomas Gleixner
2024-03-04 12:15   ` [tip: x86/cleanups] " tip-bot2 for Thomas Gleixner
2024-03-04 10:12 ` [patch 7/9] x86/cpu: Use EXPORT_PER_CPU_SYMBOL_GPL() for x86_spec_ctrl_current Thomas Gleixner
2024-03-04 12:15   ` [tip: x86/cleanups] " tip-bot2 for Thomas Gleixner
2024-03-04 10:12 ` [patch 8/9] x86/cpu: Provide a declaration for itlb_multihit_kvm_mitigation Thomas Gleixner
2024-03-04 12:15   ` [tip: x86/cleanups] " tip-bot2 for Thomas Gleixner
2024-03-04 10:12 ` [patch 9/9] x86/callthunks: Use EXPORT_PER_CPU_SYMBOL_GPL() for per CPU variables Thomas Gleixner
2024-03-04 12:15   ` [tip: x86/cleanups] " tip-bot2 for Thomas Gleixner
2024-03-04 11:08 ` [patch 0/9] x86: Cure tons of sparse warnings (mostly __percpu) Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).