On Wed, Oct 18, 2023 at 2:14 PM Uros Bizjak wrote: > > On Wed, Oct 18, 2023 at 12:54 PM Nadav Amit wrote: > > > > > > > > > On Oct 18, 2023, at 12:04 PM, Uros Bizjak wrote: > > > > > > Solved. > > > > > > All that is needed is to patch cpu_init() from > > > arch/x86/kernel/cpu/common.c with: > > > > > > --cut here-- > > > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c > > > index b14fc8c1c953..61b6fcdf6937 100644 > > > --- a/arch/x86/kernel/cpu/common.c > > > +++ b/arch/x86/kernel/cpu/common.c > > > @@ -2228,7 +2232,7 @@ void cpu_init_exception_handling(void) > > > */ > > > void cpu_init(void) > > > { > > > - struct task_struct *cur = current; > > > + struct task_struct *cur = this_cpu_read_stable(pcpu_hot.current_task); > > > int cpu = raw_smp_processor_id(); > > > > Thanks for solving that, and sorry that I missed it. > > > > The reason I didn’t encounter it before is that in my original patch I created > > a new compilation unit which only defined the alias. > > > > Since there might be additional problems (any “current” use in common.c is > > dangerous, even in included files), I think that while there may be additional > > solutions, defining the alias in a separate compilation unit - as I did before - > > is the safest. > > What happens here can be illustrated with the following testcase: > > --cut here-- > int init_mm; > > struct task_struct > { > int *active_mm; > }; > > struct task_struct init_task; > > struct pcpu_hot > { > struct task_struct *current_task; > }; > > struct pcpu_hot pcpu_hot = { .current_task = &init_task }; > > extern const struct pcpu_hot __seg_gs const_pcpu_hot > __attribute__((alias("pcpu_hot"))); > > void foo (void) > { > struct task_struct *cur = const_pcpu_hot.current_task; > > cur->active_mm = &init_mm; > } > --cut here-- > > gcc -O2 -S: > > foo: > movq $init_mm, init_task(%rip) > ret > > Here, gcc optimizes the access to generic address space, which is > allowed to, since *we set the alias to pcpu_hot*, which is in the > generic address space. The compiler doesn't care that we actually > want: > > foo: > movq %gs:const_pcpu_hot(%rip), %rax > movq $init_mm, (%rax) > > So yes, to prevent the optimization, we have to hide the alias in another TU. > > BTW: Clang creates: > > foo: > movq %gs:pcpu_hot(%rip), %rax > movq $init_mm, (%rax) > retq > > It is a bit more conservative and retains the address space of the > aliasing symbol. > > Looks like another case of underspecified functionality where both > compilers differ. Luckily, both DTRT when aliases are hidden in > another TU. Attached is the prototype patch that works for me (together with Linus' FPU switching patch). Uros.