All of lore.kernel.org
 help / color / mirror / Atom feed
From: Uros Bizjak <ubizjak@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	Andy Lutomirski <luto@kernel.org>, Ingo Molnar <mingo@kernel.org>,
	Nadav Amit <namit@vmware.com>, Brian Gerst <brgerst@gmail.com>,
	Denys Vlasenko <dvlasenk@redhat.com>,
	"H . Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>,
	Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH 4/4] x86/percpu: Use C for percpu read/write accessors
Date: Sun, 8 Oct 2023 21:17:48 +0200	[thread overview]
Message-ID: <CAFULd4YWjxoSTyCtMN0OzKgHtshMQOuMH1Z0n_OaWKVnUjy2iA@mail.gmail.com> (raw)
In-Reply-To: <CAHk-=wgepFm=jGodFQYPAaEvcBhR3-f_h1BLBYiVQsutCwCnUQ@mail.gmail.com>

On Sun, Oct 8, 2023 at 8:00 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Wed, 4 Oct 2023 at 07:51, Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > The percpu code mostly uses inline assembly. Using segment qualifiers
> > allows to use C code instead, which enables the compiler to perform
> > various optimizations (e.g. propagation of memory arguments). Convert
> > percpu read and write accessors to C code, so the memory argument can
> > be propagated to the instruction that uses this argument.
>
> So apparently this causes boot failures.
>
> It might be worth testing a version where this:
>
> > +#define raw_cpu_read_1(pcp)            __raw_cpu_read(, pcp)
> > +#define raw_cpu_read_2(pcp)            __raw_cpu_read(, pcp)
> > +#define raw_cpu_read_4(pcp)            __raw_cpu_read(, pcp)
> > +#define raw_cpu_write_1(pcp, val)      __raw_cpu_write(, pcp, val)
> > +#define raw_cpu_write_2(pcp, val)      __raw_cpu_write(, pcp, val)
> > +#define raw_cpu_write_4(pcp, val)      __raw_cpu_write(, pcp, val)
>
> and this
>
> > +#ifdef CONFIG_X86_64
> > +#define raw_cpu_read_8(pcp)            __raw_cpu_read(, pcp)
> > +#define raw_cpu_write_8(pcp, val)      __raw_cpu_write(, pcp, val)
>
> was all using 'volatile' in the qualifier argument and see if that
> makes the boot failure go away.
>
> Because while the old code wasn't "asm volatile", even just a *plain*
> asm() is certainly a lot more serialized than a normal access.
>
> For example, the asm() version of raw_cpu_write() used "+m" for the
> destination modifier, which means that if you did multiple percpu
> writes to the same variable, gcc would output multiple asm calls,
> because it would see the subsequent ones as reading the old value
> (even if they don't *actually* do so).
>
> That's admittedly really just because it uses a common macro for
> raw_cpu_write() and the updates (like the percpu_add() code), so the
> fact that it uses "+m" instead of "=m" is just a random odd artifact
> of the inline asm version, but maybe we have code that ends up working
> just by accident.
>
> Also, I'm not sure gcc re-orders asms wrt each other - even when they
> aren't marked volatile.
>
> So it might be worth at least a trivial "make everything volatile"
> test to see if that affects anything.

I have managed to reproduce the bug, and I'm trying the following path:

Scrap the last patch and just add:

#define __raw_cpu_read_new(size, qual, pcp)                \
({                                    \
    *(qual __my_cpu_type(pcp) *)__my_cpu_ptr(&(pcp));        \
})

#define __raw_cpu_read(size, qual, _var)                \
({                                    \
    __pcpu_type_##size pfo_val__;                    \
    asm qual (__pcpu_op2_##size("mov", __percpu_arg([var]), "%[val]") \
        : [val] __pcpu_reg_##size("=", pfo_val__)            \
        : [var] "m" (__my_cpu_var(_var)));                \
    (typeof(_var))(unsigned long) pfo_val__;            \
})

Then changing *only*

#define raw_cpu_read_8(pcp)            __raw_cpu_read_new(8, , pcp)

brings the boot process far enough to report:

[    0.463711][    T0] Dentry cache hash table entries: 1048576
(order: 11, 8388608 bytes, linear)
[    0.465859][    T0] Inode-cache hash table entries: 524288 (order:
10, 4194304 bytes, linear)
PANIC: early exception 0x0d IP 10:ffffffff810c4cb9 error 0 cr2
0xffff8881ab1ff000
[    0.469084][    T0] CPU: 0 PID: 0 Comm: swapper Not tainted
6.5.0-11417-gca4256348660-dirty #7
[    0.470756][    T0] RIP: 0010:cpu_init_exception_handling+0x179/0x740
[    0.472045][    T0] Code: be 0f 00 00 00 4a 03 04 ed 40 19 15 85 48
89 c7 e8 9c bb ff ff 48 c7 c0 10 73 02 00 48 ba 00 00 00 00 00 fc ff
df 48 c1 e8 03 <80> 3c 10
00 0f 85 21 05 00 00 65 48 8b 05 45 26 f6 7e 48 8d 7b 24
[    0.475784][    T0] RSP: 0000:ffffffff85207e38 EFLAGS: 00010002
ORIG_RAX: 0000000000000000
[    0.477384][    T0] RAX: 0000000000004e62 RBX: ffff88817700a000
RCX: 0000000000000010
[    0.479093][    T0] RDX: dffffc0000000000 RSI: ffffffff85207e60
RDI: ffff88817700f078
[    0.481178][    T0] RBP: 000000000000f000 R08: 0040f50000000000
R09: 0040f50000000000
[    0.482655][    T0] R10: ffff8881ab02a000 R11: 0000000000000000
R12: 1ffffffff0a40fc8
[    0.484128][    T0] R13: 0000000000000000 R14: 0000000000000000
R15: ffffffff85151940
[    0.485604][    T0] FS:  0000000000000000(0000)
GS:ffff888177000000(0000) knlGS:0000000000000000
[    0.487246][    T0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.488515][    T0] CR2: ffff8881ab1ff000 CR3: 00000000052d7000
CR4: 00000000000000b0
[    0.490002][    T0] Call Trace:
[    0.490600][    T0]  <TASK>
[    0.491145][    T0]  ? early_fixup_exception+0x10e/0x280
[    0.492176][    T0]  ? early_idt_handler_common+0x2f/0x40
[    0.493222][    T0]  ? cpu_init_exception_handling+0x179/0x740
[    0.494348][    T0]  ? cpu_init_exception_handling+0x164/0x740
[    0.495472][    T0]  ? syscall_init+0x1c0/0x1c0
[    0.496351][    T0]  ? per_cpu_ptr_to_phys+0x1ca/0x2c0
[    0.497336][    T0]  ? setup_cpu_entry_areas+0x138/0x980
[    0.498365][    T0]  trap_init+0xa/0x40

Let me see what happens here. I have changed *only* raw_cpu_read_8,
but the GP fault is reported in cpu_init_exception_handling, which
uses this_cpu_ptr. Please note that all per-cpu initializations go
through existing {this|raw}_cpu_write.

void cpu_init_exception_handling(void)
{
struct tss_struct *tss = this_cpu_ptr(&cpu_tss_rw);
int cpu = raw_smp_processor_id();
...

I have tried the trick with all reads volatile (and writes as they
were before the patch), but it didn't make a difference. Also, the
kernel from the git/tip branch works OK for default config, so I think
there is some config option that somehow disturbs the named-as enabled
kernel.

Uros.

  reply	other threads:[~2023-10-08 19:18 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-04 14:49 [PATCH 0/4] x86/percpu: Use segment qualifiers Uros Bizjak
2023-10-04 14:49 ` [PATCH 1/4] x86/percpu: Update arch/x86/include/asm/percpu.h to the current tip Uros Bizjak
2023-10-04 14:49 ` [PATCH 2/4] x86/percpu: Enable named address spaces with known compiler version Uros Bizjak
2023-10-05  7:20   ` [tip: x86/percpu] " tip-bot2 for Uros Bizjak
2023-10-04 14:49 ` [PATCH 3/4] x86/percpu: Use compiler segment prefix qualifier Uros Bizjak
2023-10-05  7:20   ` [tip: x86/percpu] " tip-bot2 for Nadav Amit
2023-10-04 14:49 ` [PATCH 4/4] x86/percpu: Use C for percpu read/write accessors Uros Bizjak
2023-10-04 16:37   ` Ingo Molnar
2023-10-04 16:40     ` Ingo Molnar
2023-10-04 19:23     ` [PATCH v2 " Uros Bizjak
2023-10-04 19:42       ` Linus Torvalds
2023-10-04 20:07         ` Uros Bizjak
2023-10-04 20:12           ` Linus Torvalds
2023-10-04 20:19             ` Linus Torvalds
2023-10-04 20:22               ` Uros Bizjak
2023-10-05  7:06       ` Ingo Molnar
2023-10-05  7:40         ` Uros Bizjak
2023-10-05  7:20       ` [tip: x86/percpu] " tip-bot2 for Uros Bizjak
2023-10-08 17:59   ` [PATCH 4/4] " Linus Torvalds
2023-10-08 19:17     ` Uros Bizjak [this message]
2023-10-08 20:13       ` Linus Torvalds
2023-10-08 20:48         ` Linus Torvalds
2023-10-08 21:41           ` Uros Bizjak
2023-10-09 11:41             ` Ingo Molnar
2023-10-09 11:51               ` Ingo Molnar
2023-10-09 12:00                 ` Uros Bizjak
2023-10-09 12:20                   ` Ingo Molnar
2023-10-09 12:21                   ` Nadav Amit
2023-10-09 12:42                     ` Uros Bizjak
2023-10-09 12:53                       ` Nadav Amit
2023-10-09 12:27               ` Uros Bizjak
2023-10-09 14:35               ` Uros Bizjak
2024-04-10 11:11                 ` Andrey Konovalov
2024-04-10 11:21                   ` Uros Bizjak
2024-04-10 11:24                     ` Andrey Konovalov
2023-10-09 11:42       ` Ingo Molnar
2023-10-10  6:37     ` Uros Bizjak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFULd4YWjxoSTyCtMN0OzKgHtshMQOuMH1Z0n_OaWKVnUjy2iA@mail.gmail.com \
    --to=ubizjak@gmail.com \
    --cc=bp@alien8.de \
    --cc=brgerst@gmail.com \
    --cc=dvlasenk@redhat.com \
    --cc=hpa@zytor.com \
    --cc=jpoimboe@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=namit@vmware.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.