linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 0/2] Avoid booting stall caused by
@ 2021-01-13  1:40 Jia He
  2021-01-13  1:40 ` [RFC PATCH 1/2] arm64/cpuinfo: Move init_cpu_features() ahead of setup.c::early_fixmap_init() Jia He
  2021-01-13  1:40 ` [RFC PATCH 2/2] arm64: kpti: Update arm64_use_ng_mappings before pagetable mapping Jia He
  0 siblings, 2 replies; 6+ messages in thread
From: Jia He @ 2021-01-13  1:40 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel
  Cc: Anshuman Khandual, Suzuki K Poulose, Jia He, Mark Rutland,
	Gustavo A. R. Silva, Richard Henderson, Dave Martin,
	Steven Price, Andrew Morton, Mike Rapoport, Ard Biesheuvel,
	Gavin Shan, Kefeng Wang, Mark Brown, Marc Zyngier,
	Cristian Marussi

There is a 10s stall in idmap_kpti_install_ng_mappings() when the kernel
boots on an Ampere EMAG server.

Commit f992b4dfd58b ("arm64: kpti: Add ->enable callback to remap
swapper using nG mappings") updates the nG bit at runtime if kpti is
required.

Things get worse with rodata=full: map_mem() then has to pass
NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS when creating the pagetable mapping,
so all memory is mapped at pte granularity. On an Ampere EMAG server with
256G of memory (pagesize=4k), rewriting all of those ptes causes the 10s
stall.
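
For a sense of scale, the arithmetic behind the paragraph above can be
sketched as follows. This is only an illustration: the helper names are
hypothetical, "256G" is assumed to mean 256 GiB, and the 2 MiB figure is
the block size a 4 KiB granule would allow if block mappings were
permitted.

```c
#include <assert.h>
#include <stdint.h>

/* Number of leaf entries needed to map mem_bytes at a given granule. */
uint64_t leaf_entries(uint64_t mem_bytes, uint64_t granule_bytes)
{
	assert(granule_bytes != 0);
	return mem_bytes / granule_bytes;
}

/* 256 GiB mapped with 4 KiB pages only (NO_BLOCK_MAPPINGS case). */
uint64_t ptes_256g_4k(void)
{
	return leaf_entries(256ULL << 30, 4ULL << 10);
}

/* 256 GiB mapped with 2 MiB blocks (hypothetical comparison case). */
uint64_t blocks_256g_2m(void)
{
	return leaf_entries(256ULL << 30, 2ULL << 20);
}
```

Under these assumptions the page-granular mapping has 67108864 leaf
entries against 131072 for 2 MiB blocks, i.e. 512 times as many entries
for a late nG rewrite to visit.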

After moving init_cpu_features() ahead of early_fixmap_init(), we can use
cpus_have_const_cap() earlier than before. Hence we can avoid this stall
by updating arm64_use_ng_mappings before the pagetable mapping is created.
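
The intended effect can be modelled in plain C as below. This is a
userspace sketch, not kernel code: struct boot_state and both helpers are
hypothetical, and it assumes that the late nG rewrite is skipped whenever
arm64_use_ng_mappings was already true when the page tables were built.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of what is known when the page tables are built. */
struct boot_state {
	bool kaslr_requires_kpti;  /* models kaslr_requires_kpti() */
	bool cap_unmap_kernel_el0; /* boot CPU has ARM64_UNMAP_KERNEL_AT_EL0 */
	bool cap_known_early;      /* cpu features initialised before mapping */
};

/* Models the value of arm64_use_ng_mappings at mapping time. */
bool use_ng_at_map_time(const struct boot_state *s)
{
	bool use_ng;

	assert(s);
	use_ng = s->kaslr_requires_kpti;
	if (!use_ng && s->cap_known_early)
		use_ng = s->cap_unmap_kernel_el0;
	return use_ng;
}

/* A late rewrite is needed only if kpti is on but mappings were global. */
bool needs_late_ng_remap(const struct boot_state *s)
{
	return s->cap_unmap_kernel_el0 && !use_ng_at_map_time(s);
}
```

With cap_known_early false the model reports that a late remap is needed
whenever the capability is set and KASLR did not already force nG
mappings; with it true the remap is never needed.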

With this patch series applied, the kernel boot time drops from 14.7s to
4.1s:
Before:
[   14.757569] Freeing initrd memory: 60752K
After:
[    4.138819] Freeing initrd memory: 60752K

I am sending this as an RFC because I would like feedback on any points
I may have misunderstood.

Jia He (2):
  arm64/cpuinfo: Move init_cpu_features() ahead of early_fixmap_init()
  arm64: kpti: Update arm64_use_ng_mappings before pagetable mapping

 arch/arm64/include/asm/cpu.h |  1 +
 arch/arm64/kernel/cpuinfo.c  | 13 ++++++++++---
 arch/arm64/kernel/setup.c    | 18 +++++++++++++-----
 arch/arm64/kernel/smp.c      |  3 +--
 4 files changed, 25 insertions(+), 10 deletions(-)

-- 
2.17.1



* [RFC PATCH 1/2] arm64/cpuinfo: Move init_cpu_features() ahead of setup.c::early_fixmap_init()
  2021-01-13  1:40 [RFC PATCH 0/2] Avoid booting stall caused by Jia He
@ 2021-01-13  1:40 ` Jia He
  2021-01-26 13:57   ` Will Deacon
  2021-01-13  1:40 ` [RFC PATCH 2/2] arm64: kpti: Update arm64_use_ng_mappings before pagetable mapping Jia He
  1 sibling, 1 reply; 6+ messages in thread
From: Jia He @ 2021-01-13  1:40 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel
  Cc: Anshuman Khandual, Suzuki K Poulose, Jia He, Mark Rutland,
	Gustavo A. R. Silva, Richard Henderson, Dave Martin,
	Steven Price, Andrew Morton, Mike Rapoport, Ard Biesheuvel,
	Gavin Shan, Kefeng Wang, Mark Brown, Marc Zyngier,
	Cristian Marussi

Move init_cpu_features() ahead of setup_arch()->early_fixmap_init() in
preparation for assigning arm64_use_ng_mappings from
cpus_have_const_cap(ARM64_UNMAP_KERNEL_AT_EL0).

jump_label_init() is also moved ahead because cpus_have_const_cap()
depends on the static key enable API.

Percpu helpers must be avoided in cpuinfo_store_boot_cpu(), since it now
runs before the percpu init in main.c::setup_per_cpu_areas().

Signed-off-by: Jia He <justin.he@arm.com>
---
 arch/arm64/include/asm/cpu.h |  1 +
 arch/arm64/kernel/cpuinfo.c  | 13 ++++++++++---
 arch/arm64/kernel/setup.c    | 14 +++++++++-----
 arch/arm64/kernel/smp.c      |  3 +--
 4 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index 7faae6ff3ab4..59f36f5e3c04 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -63,6 +63,7 @@ DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);
 
 void cpuinfo_store_cpu(void);
 void __init cpuinfo_store_boot_cpu(void);
+void __init save_boot_cpuinfo_data(void);
 
 void __init init_cpu_features(struct cpuinfo_arm64 *info);
 void update_cpu_features(int cpu, struct cpuinfo_arm64 *info,
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 77605aec25fe..f8de5b8bae20 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -413,9 +413,16 @@ void cpuinfo_store_cpu(void)
 
 void __init cpuinfo_store_boot_cpu(void)
 {
-	struct cpuinfo_arm64 *info = &per_cpu(cpu_data, 0);
-	__cpuinfo_store_cpu(info);
+	__cpuinfo_store_cpu(&boot_cpu_data);
 
-	boot_cpu_data = *info;
 	init_cpu_features(&boot_cpu_data);
 }
+
+void __init save_boot_cpuinfo_data(void)
+{
+	struct cpuinfo_arm64 *info;
+
+	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
+	info = &per_cpu(cpu_data, 0);
+	*info = boot_cpu_data;
+}
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 1a57a76e1cc2..e078ab068f3b 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -297,16 +297,20 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
 	 */
 	arm64_use_ng_mappings = kaslr_requires_kpti();
 
-	early_fixmap_init();
-	early_ioremap_init();
-
-	setup_machine_fdt(__fdt_pointer);
-
 	/*
 	 * Initialise the static keys early as they may be enabled by the
 	 * cpufeature code and early parameters.
 	 */
 	jump_label_init();
+
+	/* Init the cpu feature codes for boot cpu */
+	cpuinfo_store_boot_cpu();
+
+	early_fixmap_init();
+	early_ioremap_init();
+
+	setup_machine_fdt(__fdt_pointer);
+
 	parse_early_param();
 
 	/*
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 2499b895efea..3df1f5b1da0b 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -449,8 +449,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
 
 void __init smp_prepare_boot_cpu(void)
 {
-	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
-	cpuinfo_store_boot_cpu();
+	save_boot_cpuinfo_data();
 
 	/*
 	 * We now know enough about the boot CPU to apply the
-- 
2.17.1



* [RFC PATCH 2/2] arm64: kpti: Update arm64_use_ng_mappings before pagetable mapping
  2021-01-13  1:40 [RFC PATCH 0/2] Avoid booting stall caused by Jia He
  2021-01-13  1:40 ` [RFC PATCH 1/2] arm64/cpuinfo: Move init_cpu_features() ahead of setup.c::early_fixmap_init() Jia He
@ 2021-01-13  1:40 ` Jia He
  2021-01-26 14:14   ` Will Deacon
  1 sibling, 1 reply; 6+ messages in thread
From: Jia He @ 2021-01-13  1:40 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel
  Cc: Anshuman Khandual, Suzuki K Poulose, Jia He, Mark Rutland,
	Gustavo A. R. Silva, Richard Henderson, Dave Martin,
	Steven Price, Andrew Morton, Mike Rapoport, Ard Biesheuvel,
	Gavin Shan, Kefeng Wang, Mark Brown, Marc Zyngier,
	Cristian Marussi

There is a 10s stall in idmap_kpti_install_ng_mappings() when the kernel
boots on an Ampere EMAG server.

Commit f992b4dfd58b ("arm64: kpti: Add ->enable callback to remap
swapper using nG mappings") updates the nG bit at runtime if kpti is
required. Things get worse with rodata=full: map_mem() then has to pass
NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS when creating the pagetable mapping,
so all memory is mapped at pte granularity. On an Ampere EMAG server with
256G of memory (pagesize=4k), this causes the 10s stall.

Now that the previous commit has moved init_cpu_features(), we can use
cpus_have_const_cap() earlier than before. Hence we can avoid this stall
by updating arm64_use_ng_mappings before the pagetable mapping is created.

Signed-off-by: Jia He <justin.he@arm.com>
---
 arch/arm64/kernel/setup.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index e078ab068f3b..51098ceb7159 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -306,6 +306,10 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
 	/* Init the cpu feature codes for boot cpu */
 	cpuinfo_store_boot_cpu();
 
+	/* ARM64_UNMAP_KERNEL_AT_EL0 cap can be updated in cpuinfo_store_boot_cpu() */
+	if (!arm64_use_ng_mappings)
+		arm64_use_ng_mappings = cpus_have_const_cap(ARM64_UNMAP_KERNEL_AT_EL0);
+
 	early_fixmap_init();
 	early_ioremap_init();
 
-- 
2.17.1



* Re: [RFC PATCH 1/2] arm64/cpuinfo: Move init_cpu_features() ahead of setup.c::early_fixmap_init()
  2021-01-13  1:40 ` [RFC PATCH 1/2] arm64/cpuinfo: Move init_cpu_features() ahead of setup.c::early_fixmap_init() Jia He
@ 2021-01-26 13:57   ` Will Deacon
  2021-01-26 14:11     ` Mark Rutland
  0 siblings, 1 reply; 6+ messages in thread
From: Will Deacon @ 2021-01-26 13:57 UTC (permalink / raw)
  To: Jia He
  Cc: Catalin Marinas, linux-arm-kernel, linux-kernel,
	Anshuman Khandual, Suzuki K Poulose, Mark Rutland,
	Gustavo A. R. Silva, Richard Henderson, Dave Martin,
	Steven Price, Andrew Morton, Mike Rapoport, Ard Biesheuvel,
	Gavin Shan, Kefeng Wang, Mark Brown, Marc Zyngier,
	Cristian Marussi

On Wed, Jan 13, 2021 at 09:40:46AM +0800, Jia He wrote:
> Move init_cpu_features() ahead of setup_arch()->early_fixmap_init(), which
> is the preparation work for checking the condition to assign
> arm64_use_ng_mappings as cpus_have_const_cap(ARM64_UNMAP_KERNEL_AT_EL0).
> 
> Besides, jump_label_init() is also moved ahead because
> cpus_have_const_cap() depends on static key enable api.
> 
> Percpu helpers should be avoided in cpuinfo_store_boot_cpu() before percpu
> init at main.c::setup_per_cpu_areas()
> 
> Signed-off-by: Jia He <justin.he@arm.com>
> ---
>  arch/arm64/include/asm/cpu.h |  1 +
>  arch/arm64/kernel/cpuinfo.c  | 13 ++++++++++---
>  arch/arm64/kernel/setup.c    | 14 +++++++++-----
>  arch/arm64/kernel/smp.c      |  3 +--
>  4 files changed, 21 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
> index 7faae6ff3ab4..59f36f5e3c04 100644
> --- a/arch/arm64/include/asm/cpu.h
> +++ b/arch/arm64/include/asm/cpu.h
> @@ -63,6 +63,7 @@ DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);
>  
>  void cpuinfo_store_cpu(void);
>  void __init cpuinfo_store_boot_cpu(void);
> +void __init save_boot_cpuinfo_data(void);
>  
>  void __init init_cpu_features(struct cpuinfo_arm64 *info);
>  void update_cpu_features(int cpu, struct cpuinfo_arm64 *info,
> diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
> index 77605aec25fe..f8de5b8bae20 100644
> --- a/arch/arm64/kernel/cpuinfo.c
> +++ b/arch/arm64/kernel/cpuinfo.c
> @@ -413,9 +413,16 @@ void cpuinfo_store_cpu(void)
>  
>  void __init cpuinfo_store_boot_cpu(void)
>  {
> -	struct cpuinfo_arm64 *info = &per_cpu(cpu_data, 0);
> -	__cpuinfo_store_cpu(info);
> +	__cpuinfo_store_cpu(&boot_cpu_data);
>  
> -	boot_cpu_data = *info;
>  	init_cpu_features(&boot_cpu_data);
>  }
> +
> +void __init save_boot_cpuinfo_data(void)
> +{
> +	struct cpuinfo_arm64 *info;
> +
> +	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
> +	info = &per_cpu(cpu_data, 0);
> +	*info = boot_cpu_data;
> +}
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index 1a57a76e1cc2..e078ab068f3b 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -297,16 +297,20 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
>  	 */
>  	arm64_use_ng_mappings = kaslr_requires_kpti();
>  
> -	early_fixmap_init();
> -	early_ioremap_init();
> -
> -	setup_machine_fdt(__fdt_pointer);
> -
>  	/*
>  	 * Initialise the static keys early as they may be enabled by the
>  	 * cpufeature code and early parameters.
>  	 */
>  	jump_label_init();

I don't think your patch changes this, but afaict jump_label_init() uses
per-cpu variables via cpus_read_lock(), yet we don't initialise our offset
until later on. Any idea how that works?

Will


* Re: [RFC PATCH 1/2] arm64/cpuinfo: Move init_cpu_features() ahead of setup.c::early_fixmap_init()
  2021-01-26 13:57   ` Will Deacon
@ 2021-01-26 14:11     ` Mark Rutland
  0 siblings, 0 replies; 6+ messages in thread
From: Mark Rutland @ 2021-01-26 14:11 UTC (permalink / raw)
  To: Will Deacon
  Cc: Jia He, Catalin Marinas, linux-arm-kernel, linux-kernel,
	Anshuman Khandual, Suzuki K Poulose, Gustavo A. R. Silva,
	Richard Henderson, Dave Martin, Steven Price, Andrew Morton,
	Mike Rapoport, Ard Biesheuvel, Gavin Shan, Kefeng Wang,
	Mark Brown, Marc Zyngier, Cristian Marussi

On Tue, Jan 26, 2021 at 01:57:13PM +0000, Will Deacon wrote:
> > @@ -297,16 +297,20 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
> >  	 */
> >  	arm64_use_ng_mappings = kaslr_requires_kpti();
> >  
> > -	early_fixmap_init();
> > -	early_ioremap_init();
> > -
> > -	setup_machine_fdt(__fdt_pointer);
> > -
> >  	/*
> >  	 * Initialise the static keys early as they may be enabled by the
> >  	 * cpufeature code and early parameters.
> >  	 */
> >  	jump_label_init();
> 
> I don't think your patch changes this, but afaict jump_label_init() uses
> per-cpu variables via cpus_read_lock(), yet we don't initialise our offset
> until later on. Any idea how that works?

We initialize the boot CPU's offset twice during boot, once before this
in smp_setup_processor_id(), and once afterwards in
smp_prepare_boot_cpu() since setup_per_cpu_areas() will allocate a new
region for CPU0.

IIUC per-cpu writes before smp_prepare_boot_cpu() are potentially dodgy
since they might be copied to other CPUs, but reads are all fine.
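
To make the two-stage offset setup concrete, here is a small userspace
model. It is a sketch under stated assumptions, not the kernel's
implementation: all names are hypothetical, and it assumes the first
offset makes early accesses land in the static per-cpu template, which
setup_per_cpu_areas() later copies for every CPU.

```c
#include <assert.h>
#include <string.h>

#define MODEL_NR_CPUS 4

/* Stand-in for one per-cpu variable living in the static template. */
struct pcpu_area {
	int counter;
};

static struct pcpu_area pcpu_template;            /* static template */
static struct pcpu_area pcpu_copy[MODEL_NR_CPUS]; /* per-CPU regions */
static struct pcpu_area *boot_cpu_area;           /* models the offset */

/* First offset setup: boot CPU accesses hit the template directly. */
void model_early_offset(void)
{
	boot_cpu_area = &pcpu_template;
}

/* Models setup_per_cpu_areas(): each CPU gets a copy of the template. */
void model_setup_per_cpu_areas(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		memcpy(&pcpu_copy[cpu], &pcpu_template, sizeof(pcpu_template));
}

/* Second offset setup: boot CPU now points at its own new region. */
void model_smp_prepare_boot_cpu(void)
{
	boot_cpu_area = &pcpu_copy[0];
}

/* Returns what 'cpu' sees after the boot CPU wrote 'value' early. */
int early_write_seen_by(int cpu, int value)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_early_offset();
	boot_cpu_area->counter = value; /* early per-cpu write */
	model_setup_per_cpu_areas();
	model_smp_prepare_boot_cpu();
	return pcpu_copy[cpu].counter;
}
```

In this model an early write by the boot CPU shows up in every CPU's
region after the copy, which is the "copied to other CPUs" hazard, while
early reads simply return the template's initial values.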

Mark.


* Re: [RFC PATCH 2/2] arm64: kpti: Update arm64_use_ng_mappings before pagetable mapping
  2021-01-13  1:40 ` [RFC PATCH 2/2] arm64: kpti: Update arm64_use_ng_mappings before pagetable mapping Jia He
@ 2021-01-26 14:14   ` Will Deacon
  0 siblings, 0 replies; 6+ messages in thread
From: Will Deacon @ 2021-01-26 14:14 UTC (permalink / raw)
  To: Jia He
  Cc: Catalin Marinas, linux-arm-kernel, linux-kernel,
	Anshuman Khandual, Suzuki K Poulose, Mark Rutland,
	Gustavo A. R. Silva, Richard Henderson, Dave Martin,
	Steven Price, Andrew Morton, Mike Rapoport, Ard Biesheuvel,
	Gavin Shan, Kefeng Wang, Mark Brown, Marc Zyngier,
	Cristian Marussi

On Wed, Jan 13, 2021 at 09:40:47AM +0800, Jia He wrote:
> There is a 10s stall in idmap_kpti_install_ng_mappings when kernel boots
> on a Ampere EMAG server.
> 
> Commit f992b4dfd58b ("arm64: kpti: Add ->enable callback to remap
> swapper using nG mappings") updates the nG bit runtime if kpti is required.
> But things get worse if rodata=full in map_mem(). NO_BLOCK_MAPPINGS |
> NO_CONT_MAPPINGS is required when creating pagetable mapping. Hence all
> ptes are fully mapped in this case. On a Ampere EMAG server with 256G
> memory(pagesize=4k), it causes the 10s stall.
> 
> After previous commit moving init_cpu_features(), we can use
> cpu_have_const_cap earlier than before. Hence we can avoid this stall
> by updating arm64_use_ng_mappings.
> 
> Signed-off-by: Jia He <justin.he@arm.com>
> ---
>  arch/arm64/kernel/setup.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index e078ab068f3b..51098ceb7159 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -306,6 +306,10 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
>  	/* Init the cpu feature codes for boot cpu */
>  	cpuinfo_store_boot_cpu();
>  
> +	/* ARM64_UNMAP_KERNEL_AT_EL0 cap can be updated in cpuinfo_store_boot_cpu() */
> +	if (!arm64_use_ng_mappings)
> +		arm64_use_ng_mappings = cpus_have_const_cap(ARM64_UNMAP_KERNEL_AT_EL0);

Are you sure it's safe to run the cpu feature initialisation code this
early? For example, we haven't even parsed the command-line yet, so I think
a fair amount of stuff will break.

Of course, you could also just pass "mitigations=off" if you want your
performance back.

Will

