From: Brian Gerst <brgerst@gmail.com>
To: Usama Arif <usama.arif@bytedance.com>
Cc: dwmw2@infradead.org, tglx@linutronix.de, kim.phillips@amd.com,
piotrgorski@cachyos.org, oleksandr@natalenko.name,
arjan@linux.intel.com, mingo@redhat.com, bp@alien8.de,
dave.hansen@linux.intel.com, hpa@zytor.com, x86@kernel.org,
pbonzini@redhat.com, paulmck@kernel.org,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
rcu@vger.kernel.org, mimoja@mimoja.de, hewenliang4@huawei.com,
thomas.lendacky@amd.com, seanjc@google.com,
pmenzel@molgen.mpg.de, fam.zheng@bytedance.com,
punit.agrawal@bytedance.com, simon.evans@bytedance.com,
liangma@liangbit.com, David Woodhouse <dwmw@amazon.co.uk>,
"Guilherme G . Piccoli" <gpiccoli@igalia.com>
Subject: Re: [PATCH v14 09/12] x86/smpboot: Support parallel startup of secondary CPUs
Date: Mon, 13 Mar 2023 18:21:29 -0400
Message-ID: <CAMzpN2iz4UzQh7YGOo6vGB2qmpogiCUCTfYpvdeP=_Dm=66BwA@mail.gmail.com>
In-Reply-To: <20230308171328.1562857-10-usama.arif@bytedance.com>
On Wed, Mar 8, 2023 at 12:13 PM Usama Arif <usama.arif@bytedance.com> wrote:
>
> From: David Woodhouse <dwmw@amazon.co.uk>
>
> Rework the real-mode startup code to allow for APs to be brought up in
> parallel. This is in two parts:
>
> 1. Introduce a bit-spinlock to prevent them from all using the real
> mode stack at the same time.
>
> 2. Avoid needing to use the global smpboot_control variable to pass
> each AP its CPU#.
>
> To achieve the latter, export the cpuid_to_apicid[] array so that each
> AP can find its own CPU# by searching therein based on its APIC ID.
>
> Introduce flags in the top bits of smpboot_control which indicate methods
> by which an AP should find its CPU#. For a serialized bringup, the CPU#
> is explicitly passed in the low bits of smpboot_control as before. For
> parallel mode there are flags directing the AP to find its APIC ID in
> CPUID leaf 0x0b (for X2APIC mode) or CPUID leaf 0x01 where 8 bits are
> sufficient, then perform the cpuid_to_apicid[] lookup with that.
>
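For illustration only (not part of the patch): the lookup described above
can be modelled in plain C roughly as follows. The flag values and the
0x0FFFFFFF mask are copied from the quoted hunks; find_cpu_nr() and its
parameters are hypothetical names, with 'apicid' standing in for whatever
the AP reads from CPUID and 'table'/'nr' standing in for cpuid_to_apicid[]
and nr_cpu_ids.

```c
#include <stdint.h>

/* Flag values as defined in the quoted <asm/smp.h> hunk. */
#define STARTUP_APICID_CPUID_0B 0x80000000u
#define STARTUP_APICID_CPUID_01 0x40000000u
#define STARTUP_PARALLEL_MASK   (STARTUP_APICID_CPUID_01 | STARTUP_APICID_CPUID_0B)

/*
 * Hypothetical helper: user-space model of the CPU# lookup.
 * Returns the CPU#, or -1 in the case where the asm drops the
 * trampoline lock and halts.
 */
static int find_cpu_nr(uint32_t control, uint32_t apicid,
                       const int *table, unsigned int nr)
{
	/* Serialized bringup: CPU# is passed in the low bits. */
	if (!(control & STARTUP_PARALLEL_MASK))
		return (int)(control & 0x0FFFFFFFu);

	/* Parallel bringup: linear search for our APIC ID. */
	for (unsigned int cpu = 0; cpu < nr; cpu++) {
		if ((uint32_t)table[cpu] == apicid)
			return (int)cpu;
	}

	return -1;
}
```

With a table of { 0, 2, 4, 6 }, an AP in parallel mode that reads APIC ID 4
would resolve to CPU 2, while a serialized bringup with smpboot_control == 3
resolves to CPU 3 without consulting the table.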
> Parallel startup may be disabled by a command line option, and also if:
> • AMD SEV-ES is in use, since the AP may not use CPUID that early.
> • X2APIC is enabled, but CPUID leaf 0xb is not present and correct.
> • X2APIC is not enabled but not even CPUID leaf 0x01 exists.
>
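A rough sketch of the decision the list above describes, assuming the
conditions are evaluated in the order given. struct bringup_env and
choose_startup_flags() are hypothetical names used only for this example;
the smpboot.c changes that actually implement this are not quoted here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in the quoted <asm/smp.h> hunk. */
#define STARTUP_APICID_CPUID_0B 0x80000000u
#define STARTUP_APICID_CPUID_01 0x40000000u

/* Hypothetical summary of the inputs named in the commit message. */
struct bringup_env {
	bool disabled_on_cmdline;	/* no_parallel_bringup was given */
	bool sev_es;			/* AMD SEV-ES guest */
	bool x2apic;			/* X2APIC mode enabled */
	bool leaf_0b_ok;		/* CPUID leaf 0x0b present and correct */
	bool leaf_01_ok;		/* CPUID leaf 0x01 exists */
};

/* Returns the flag to put in smpboot_control, or 0 for serialized bringup. */
static uint32_t choose_startup_flags(const struct bringup_env *env)
{
	if (env->disabled_on_cmdline || env->sev_es)
		return 0;

	if (env->x2apic)
		return env->leaf_0b_ok ? STARTUP_APICID_CPUID_0B : 0;

	return env->leaf_01_ok ? STARTUP_APICID_CPUID_01 : 0;
}
```

A return value of 0 here corresponds to the existing behaviour, where the
CPU# is written into the low bits of smpboot_control for each AP in turn.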
> Aside from the fact that APs will now look up their CPU# via the
> newly-exported cpuid_to_apicid[] table, there is no behavioural change
> intended yet, since new parallel CPUHP states have not — yet — been
> added.
>
> [ tglx: Initial proof of concept patch with bitlock and APIC ID lookup ]
> [ dwmw2: Rework and testing, commit message, CPUID 0x1 and CPU0 support ]
> [ seanc: Fix stray override of initial_gs in common_cpu_up() ]
> [ Oleksandr Natalenko: reported suspend/resume issue fixed in
> x86_acpi_suspend_lowlevel ]
> Co-developed-by: Thomas Gleixner <tglx@linutronix.de>
> Co-developed-by: Brian Gerst <brgerst@gmail.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Brian Gerst <brgerst@gmail.com>
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> Signed-off-by: Usama Arif <usama.arif@bytedance.com>
> Tested-by: Paul E. McKenney <paulmck@kernel.org>
> Tested-by: Kim Phillips <kim.phillips@amd.com>
> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
> ---
> .../admin-guide/kernel-parameters.txt | 3 +
> arch/x86/include/asm/cpu.h | 1 +
> arch/x86/include/asm/realmode.h | 3 +
> arch/x86/include/asm/smp.h | 6 ++
> arch/x86/kernel/acpi/sleep.c | 9 ++-
> arch/x86/kernel/apic/apic.c | 2 +-
> arch/x86/kernel/cpu/topology.c | 2 +-
> arch/x86/kernel/head_64.S | 64 +++++++++++++++++++
> arch/x86/kernel/smpboot.c | 50 ++++++++++++++-
> arch/x86/realmode/init.c | 3 +
> arch/x86/realmode/rm/trampoline_64.S | 27 ++++++--
> 11 files changed, 161 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 6cfa6e3996cf..7bb7020f97e2 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -3819,6 +3819,9 @@
>
> nomodule Disable module load
>
> + no_parallel_bringup
> + [X86,SMP] Disable parallel bring-up of secondary cores.
> +
> nopat [X86] Disable PAT (page attribute table extension of
> pagetables) support.
>
> diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
> index 78796b98a544..ef8ba318dca1 100644
> --- a/arch/x86/include/asm/cpu.h
> +++ b/arch/x86/include/asm/cpu.h
> @@ -97,5 +97,6 @@ static inline bool intel_cpu_signatures_match(unsigned int s1, unsigned int p1,
> extern u64 x86_read_arch_cap_msr(void);
> int intel_find_matching_signature(void *mc, unsigned int csig, int cpf);
> int intel_microcode_sanity_check(void *mc, bool print_err, int hdr_type);
> +int check_extended_topology_leaf(int leaf);
>
> #endif /* _ASM_X86_CPU_H */
> diff --git a/arch/x86/include/asm/realmode.h b/arch/x86/include/asm/realmode.h
> index f6a1737c77be..87e5482acd0d 100644
> --- a/arch/x86/include/asm/realmode.h
> +++ b/arch/x86/include/asm/realmode.h
> @@ -52,6 +52,7 @@ struct trampoline_header {
> u64 efer;
> u32 cr4;
> u32 flags;
> + u32 lock;
> #endif
> };
>
> @@ -64,6 +65,8 @@ extern unsigned long initial_stack;
> extern unsigned long initial_vc_handler;
> #endif
>
> +extern u32 *trampoline_lock;
> +
> extern unsigned char real_mode_blob[];
> extern unsigned char real_mode_relocs[];
>
> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
> index bf2c51df9e0b..1cf4f1e57570 100644
> --- a/arch/x86/include/asm/smp.h
> +++ b/arch/x86/include/asm/smp.h
> @@ -203,4 +203,10 @@ extern unsigned int smpboot_control;
>
> #endif /* !__ASSEMBLY__ */
>
> +/* Control bits for startup_64 */
> +#define STARTUP_APICID_CPUID_0B 0x80000000
> +#define STARTUP_APICID_CPUID_01 0x40000000
> +
> +#define STARTUP_PARALLEL_MASK (STARTUP_APICID_CPUID_01 | STARTUP_APICID_CPUID_0B)
> +
> #endif /* _ASM_X86_SMP_H */
> diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c
> index 1328c221af30..6dfecb27b846 100644
> --- a/arch/x86/kernel/acpi/sleep.c
> +++ b/arch/x86/kernel/acpi/sleep.c
> @@ -16,6 +16,7 @@
> #include <asm/cacheflush.h>
> #include <asm/realmode.h>
> #include <asm/hypervisor.h>
> +#include <asm/smp.h>
>
> #include <linux/ftrace.h>
> #include "../../realmode/rm/wakeup.h"
> @@ -127,7 +128,13 @@ int x86_acpi_suspend_lowlevel(void)
> * value is in the actual %rsp register.
> */
> current->thread.sp = (unsigned long)temp_stack + sizeof(temp_stack);
> - smpboot_control = smp_processor_id();
> + /*
> + * Ensure the CPU knows which one it is when it comes back, if
> + * it isn't in parallel mode and expected to work that out for
> + * itself.
> + */
> + if (!(smpboot_control & STARTUP_PARALLEL_MASK))
> + smpboot_control = smp_processor_id();
> #endif
> initial_code = (unsigned long)wakeup_long64;
> saved_magic = 0x123456789abcdef0L;
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index 20d9a604da7c..ac1d7e5da1f2 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -2377,7 +2377,7 @@ static int nr_logical_cpuids = 1;
> /*
> * Used to store mapping between logical CPU IDs and APIC IDs.
> */
> -static int cpuid_to_apicid[] = {
> +int cpuid_to_apicid[] = {
> [0 ... NR_CPUS - 1] = -1,
> };
>
> diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
> index 5e868b62a7c4..1dc20f4dfa6e 100644
> --- a/arch/x86/kernel/cpu/topology.c
> +++ b/arch/x86/kernel/cpu/topology.c
> @@ -32,7 +32,7 @@ EXPORT_SYMBOL(__max_die_per_package);
> /*
> * Check if given CPUID extended topology "leaf" is implemented
> */
> -static int check_extended_topology_leaf(int leaf)
> +int check_extended_topology_leaf(int leaf)
> {
> unsigned int eax, ebx, ecx, edx;
>
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index 6a8238702eab..c35f7c173832 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -25,6 +25,7 @@
> #include <asm/export.h>
> #include <asm/nospec-branch.h>
> #include <asm/fixmap.h>
> +#include <asm/smp.h>
>
> /*
> * We are not able to switch in one step to the final KERNEL ADDRESS SPACE
> @@ -234,8 +235,61 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
> ANNOTATE_NOENDBR // above
>
> #ifdef CONFIG_SMP
> + /*
> + * For parallel boot, the APIC ID is retrieved from CPUID, and then
> + * used to look up the CPU number. For booting a single CPU, the
> + * CPU number is encoded in smpboot_control.
> + *
> + * Bit 31 STARTUP_APICID_CPUID_0B flag (use CPUID 0x0b)
> + * Bit 30 STARTUP_APICID_CPUID_01 flag (use CPUID 0x01)
> + * Bit 0-24 CPU# if STARTUP_APICID_CPUID_xx flags are not set
> + */
> movl smpboot_control(%rip), %ecx
> + testl $STARTUP_APICID_CPUID_0B, %ecx
> + jnz .Luse_cpuid_0b
> + testl $STARTUP_APICID_CPUID_01, %ecx
> + jnz .Luse_cpuid_01
> + andl $0x0FFFFFFF, %ecx
> + jmp .Lsetup_cpu
> +
> +.Luse_cpuid_01:
> + mov $0x01, %eax
> + cpuid
> + mov %ebx, %edx
> + shr $24, %edx
> + jmp .Lsetup_AP
>
> +.Luse_cpuid_0b:
> + mov $0x0B, %eax
> + xorl %ecx, %ecx
> + cpuid
> +
> +.Lsetup_AP:
> + /* EDX contains the APIC ID of the current CPU */
> + xorq %rcx, %rcx
> + leaq cpuid_to_apicid(%rip), %rbx
> +
> +.Lfind_cpunr:
> + cmpl (%rbx,%rcx,4), %edx
> + jz .Lsetup_cpu
> + inc %ecx
> +#ifdef CONFIG_FORCE_NR_CPUS
> + cmpl $NR_CPUS, %ecx
> +#else
> + cmpl nr_cpu_ids(%rip), %ecx
> +#endif
> + jb .Lfind_cpunr
> +
> + /* APIC ID not found in the table. Drop the trampoline lock and bail. */
> + movq trampoline_lock(%rip), %rax
> + lock
> + btrl $0, (%rax)
> +
> +1: cli
> + hlt
> + jmp 1b
> +
> +.Lsetup_cpu:
> /* Get the per cpu offset for the given CPU# which is in ECX */
> movq __per_cpu_offset(,%rcx,8), %rdx
> #else
> @@ -293,6 +347,14 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
> shrq $32, %rdx
> wrmsr
>
> + /* Drop the realmode protection. For the boot CPU the pointer is NULL! */
> + movq trampoline_lock(%rip), %rax
> + testq %rax, %rax
> + jz .Lsetup_idt
> + lock
> + btrl $0, (%rax)
> +
> +.Lsetup_idt:
Now that the lock only protects the realmode stack, it can be released
right after the switch to the idle task stack.
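To make the locking semantics concrete, here is a small user-space model
of the bit lock using C11 atomics. The release mirrors the quoted
"lock btrl $0, (%rax)"; the acquire side is an assumption, since the
trampoline_64.S hunk is not quoted above. tr_lock(), tr_trylock() and
tr_unlock() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Try once to set bit 0; true if this caller now owns the lock. */
static bool tr_trylock(_Atomic uint32_t *lock)
{
	return !(atomic_fetch_or_explicit(lock, 1u, memory_order_acquire) & 1u);
}

/* Spin until bit 0 was clear when we set it (assumed acquire path). */
static void tr_lock(_Atomic uint32_t *lock)
{
	while (!tr_trylock(lock))
		;
}

/* Clear bit 0, as the quoted "lock btrl $0" does. */
static void tr_unlock(_Atomic uint32_t *lock)
{
	atomic_fetch_and_explicit(lock, ~1u, memory_order_release);
}
```

The only resource guarded is the shared stack, so an AP that has already
moved to its own idle task stack has no reason to keep holding the bit,
and the next AP can enter the trampoline that much sooner.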
--
Brian Gerst