linux-kernel.vger.kernel.org archive mirror
* [PATCH] x86/64/mm: Map all kernel memory into trampoline_pgd
@ 2021-09-13  9:52 Joerg Roedel
  2021-09-14  7:52 ` Mike Rapoport
  0 siblings, 1 reply; 3+ messages in thread
From: Joerg Roedel @ 2021-09-13  9:52 UTC
  To: x86
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa, jroedel,
	Mike Rapoport, Andrew Morton, Brijesh Singh, linux-kernel, joro,
	stable

From: Joerg Roedel <jroedel@suse.de>

The trampoline_pgd only maps the 0xffffff8000000000-0xffffffffffffffff
range of kernel memory (with 4-level paging), i.e. the last PGD slot.
This range contains the kernel's text+data+bss mappings and the module
mapping space, but not the direct mapping and the vmalloc area.
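
As a rough illustration of how these ranges fall out of the page-table
layout, here is a user-space sketch of the PGD index arithmetic. The
constants (a shift of 39 bits and 512 top-level entries) are assumptions
for 4-level paging, pgd_index() is re-implemented here rather than taken
from the kernel headers, and pgd_slot_base() is a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for x86-64 with 4-level paging (not part of the patch). */
#define PGDIR_SHIFT	39
#define PTRS_PER_PGD	512

/* User-space model: index of the top-level (PGD) slot translating addr. */
static inline unsigned int pgd_index(uint64_t addr)
{
	return (unsigned int)((addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/* Hypothetical helper: lowest canonical address covered by a PGD slot. */
static inline uint64_t pgd_slot_base(unsigned int idx)
{
	uint64_t va;

	assert(idx < PTRS_PER_PGD);
	va = (uint64_t)idx << PGDIR_SHIFT;

	/* Sign-extend bit 47 so the result is a canonical address. */
	if (va & (1ULL << 47))
		va |= 0xffff000000000000ULL;
	return va;
}
```

Under these assumptions pgd_slot_base(511) is 0xffffff8000000000, and
the default direct-mapping base without KASLR, 0xffff888000000000,
falls into slot 273, which a table with only slot 511 populated does
not cover.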

This is enough to get an application processor out of real-mode, but
for code that switches back to real-mode the trampoline_pgd is missing
important parts of the address space. For example, consider this code
from arch/x86/kernel/reboot.c, function machine_real_restart() for a
64-bit kernel:

	#ifdef CONFIG_X86_32
		load_cr3(initial_page_table);
	#else
		write_cr3(real_mode_header->trampoline_pgd);

		/* Exiting long mode will fail if CR4.PCIDE is set. */
		if (boot_cpu_has(X86_FEATURE_PCID))
			cr4_clear_bits(X86_CR4_PCIDE);
	#endif

		/* Jump to the identity-mapped low memory code */
	#ifdef CONFIG_X86_32
		asm volatile("jmpl *%0" : :
			     "rm" (real_mode_header->machine_real_restart_asm),
			     "a" (type));
	#else
		asm volatile("ljmpl *%0" : :
			     "m" (real_mode_header->machine_real_restart_asm),
			     "D" (type));
	#endif

The code switches to the trampoline_pgd, which unmaps the direct mapping
and also the kernel stack. The call to cr4_clear_bits() will find no
stack and crash the machine. The real_mode_header pointer below points
into the direct mapping, and dereferencing it also causes a crash.

The only reason this does not always crash is that kernel mappings are
global and the CR3 switch does not flush those mappings. But if these
mappings are not in the TLB already, the above code will crash before it
can jump to the real-mode stub.

Extend the trampoline_pgd to contain all kernel mappings to prevent
these crashes and to make code which runs on this page-table more
robust.

Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/realmode/init.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 31b5856010cb..7a08c96cb42a 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -72,6 +72,7 @@ static void __init setup_real_mode(void)
 #ifdef CONFIG_X86_64
 	u64 *trampoline_pgd;
 	u64 efer;
+	int i;
 #endif
 
 	base = (unsigned char *)real_mode_header;
@@ -128,8 +129,17 @@ static void __init setup_real_mode(void)
 	trampoline_header->flags = 0;
 
 	trampoline_pgd = (u64 *) __va(real_mode_header->trampoline_pgd);
+
+	/*
+	 * Map all of kernel memory into the trampoline PGD so that it includes
+	 * the direct mapping and vmalloc space. This is needed to keep the
+	 * stack and real_mode_header mapped when switching to this page table.
+	 */
+	for (i = pgd_index(__PAGE_OFFSET); i < PTRS_PER_PGD; i++)
+		trampoline_pgd[i] = init_top_pgt[i].pgd;
+
+	/* Map the real mode stub as virtual == physical */
 	trampoline_pgd[0] = trampoline_pgd_entry.pgd;
-	trampoline_pgd[511] = init_top_pgt[511].pgd;
 #endif
 
 	sme_sev_setup_real_mode(trampoline_header);
-- 
2.33.0
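
For readers who want to poke at the copy loop outside the kernel, the
following is a minimal user-space model of it. Everything here is an
assumption made for illustration: the tables are plain u64 arrays, the
direct-mapping base is the default 4-level value without KASLR, and
fill_trampoline_pgd() is a hypothetical name, not a kernel function:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for x86-64 with 4-level paging and no KASLR. */
#define PGDIR_SHIFT	39
#define PTRS_PER_PGD	512
#define PAGE_OFFSET_4L	0xffff888000000000ULL

/* User-space model of the kernel's pgd_index(). */
static inline unsigned int pgd_index(uint64_t addr)
{
	return (unsigned int)((addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/*
 * Hypothetical helper mirroring the patched logic: copy every top-level
 * entry from the direct-mapping slot upwards out of the reference table,
 * then install the identity mapping for the real-mode stub in slot 0.
 */
static inline void fill_trampoline_pgd(uint64_t *trampoline,
				       const uint64_t *reference,
				       uint64_t identity_entry)
{
	unsigned int i;

	assert(trampoline && reference);

	for (i = pgd_index(PAGE_OFFSET_4L); i < PTRS_PER_PGD; i++)
		trampoline[i] = reference[i];

	trampoline[0] = identity_entry;
}
```

In this model, slots below the direct mapping (other than slot 0) stay
untouched, while the direct mapping and everything above it, up to and
including slot 511, come from the reference table.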



* Re: [PATCH] x86/64/mm: Map all kernel memory into trampoline_pgd
  2021-09-13  9:52 [PATCH] x86/64/mm: Map all kernel memory into trampoline_pgd Joerg Roedel
@ 2021-09-14  7:52 ` Mike Rapoport
  2021-09-15 11:49   ` Joerg Roedel
  0 siblings, 1 reply; 3+ messages in thread
From: Mike Rapoport @ 2021-09-14  7:52 UTC
  To: Joerg Roedel
  Cc: x86, Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa, jroedel,
	Andrew Morton, Brijesh Singh, linux-kernel, stable

On Mon, Sep 13, 2021 at 11:52:36AM +0200, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
> 
> The trampoline_pgd only maps the 0xffffff8000000000-0xffffffffffffffff
> range of kernel memory (with 4-level paging), i.e. the last PGD slot.
> This range contains the kernel's text+data+bss mappings and the module
> mapping space, but not the direct mapping and the vmalloc area.
> 
> This is enough to get an application processor out of real-mode, but
> for code that switches back to real-mode the trampoline_pgd is missing
> important parts of the address space. For example, consider this code
> from arch/x86/kernel/reboot.c, function machine_real_restart() for a
> 64-bit kernel:
> 
> 	#ifdef CONFIG_X86_32
> 		load_cr3(initial_page_table);
> 	#else
> 		write_cr3(real_mode_header->trampoline_pgd);
> 
> 		/* Exiting long mode will fail if CR4.PCIDE is set. */
> 		if (boot_cpu_has(X86_FEATURE_PCID))
> 			cr4_clear_bits(X86_CR4_PCIDE);
> 	#endif
> 
> 		/* Jump to the identity-mapped low memory code */
> 	#ifdef CONFIG_X86_32
> 		asm volatile("jmpl *%0" : :
> 			     "rm" (real_mode_header->machine_real_restart_asm),
> 			     "a" (type));
> 	#else
> 		asm volatile("ljmpl *%0" : :
> 			     "m" (real_mode_header->machine_real_restart_asm),
> 			     "D" (type));
> 	#endif
> 
> The code switches to the trampoline_pgd, which unmaps the direct mapping
> and also the kernel stack. The call to cr4_clear_bits() will find no
> stack and crash the machine. The real_mode_header pointer below points
> into the direct mapping, and dereferencing it also causes a crash.
> 
> The only reason this does not always crash is that kernel mappings are
> global and the CR3 switch does not flush those mappings. But if these
> mappings are not in the TLB already, the above code will crash before it
> can jump to the real-mode stub.
> 
> Extend the trampoline_pgd to contain all kernel mappings to prevent
> these crashes and to make code which runs on this page-table more
> robust.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>  arch/x86/realmode/init.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
> index 31b5856010cb..7a08c96cb42a 100644
> --- a/arch/x86/realmode/init.c
> +++ b/arch/x86/realmode/init.c
> @@ -72,6 +72,7 @@ static void __init setup_real_mode(void)
>  #ifdef CONFIG_X86_64
>  	u64 *trampoline_pgd;
>  	u64 efer;
> +	int i;
>  #endif
>  
>  	base = (unsigned char *)real_mode_header;
> @@ -128,8 +129,17 @@ static void __init setup_real_mode(void)
>  	trampoline_header->flags = 0;
>  
>  	trampoline_pgd = (u64 *) __va(real_mode_header->trampoline_pgd);
> +
> +	/*
> +	 * Map all of kernel memory into the trampoline PGD so that it includes
> +	 * the direct mapping and vmalloc space. This is needed to keep the
> +	 * stack and real_mode_header mapped when switching to this page table.
> +	 */
> +	for (i = pgd_index(__PAGE_OFFSET); i < PTRS_PER_PGD; i++)
> +		trampoline_pgd[i] = init_top_pgt[i].pgd;

Don't we need to update the trampoline_pgd in sync_global_pgds() as well?

> +
> +	/* Map the real mode stub as virtual == physical */
>  	trampoline_pgd[0] = trampoline_pgd_entry.pgd;
> -	trampoline_pgd[511] = init_top_pgt[511].pgd;
>  #endif
>  
>  	sme_sev_setup_real_mode(trampoline_header);
> -- 
> 2.33.0
> 

-- 
Sincerely yours,
Mike.


* Re: [PATCH] x86/64/mm: Map all kernel memory into trampoline_pgd
  2021-09-14  7:52 ` Mike Rapoport
@ 2021-09-15 11:49   ` Joerg Roedel
  0 siblings, 0 replies; 3+ messages in thread
From: Joerg Roedel @ 2021-09-15 11:49 UTC
  To: Mike Rapoport
  Cc: x86, Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa, jroedel,
	Andrew Morton, Brijesh Singh, linux-kernel, stable

Hi Mike,

On Tue, Sep 14, 2021 at 10:52:39AM +0300, Mike Rapoport wrote:
> On Mon, Sep 13, 2021 at 11:52:36AM +0200, Joerg Roedel wrote:
> > +	for (i = pgd_index(__PAGE_OFFSET); i < PTRS_PER_PGD; i++)
> > +		trampoline_pgd[i] = init_top_pgt[i].pgd;
> 
> Don't we need to update the trampoline_pgd in sync_global_pgds() as well?

No, the trampoline_pgd is set up after preallocate_vmalloc_pages(), so
everything that would need synchronization is already in the reference
page-table.
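
To illustrate the reasoning, here is a toy user-space model with a
two-level table. The types, sizes and the copy_top_slots() helper are
all made up for the sketch and do not correspond to kernel code. The
point it tries to show is that copying a top-level entry shares the
lower-level table behind it, so only top-level slots that get populated
after the copy would be missed:

```c
#include <assert.h>
#include <stddef.h>

/* Tiny stand-ins for the real table sizes (assumptions for the sketch). */
#define NR_TOP	4
#define NR_LOW	4

struct low_table {
	unsigned long entry[NR_LOW];
};

struct top_table {
	struct low_table *slot[NR_TOP];
};

/*
 * Hypothetical helper: copy top-level slots [first, NR_TOP) from src to
 * dst. Only the pointers are copied, so both tables end up referencing
 * the same lower-level tables.
 */
static inline void copy_top_slots(struct top_table *dst,
				  const struct top_table *src,
				  size_t first)
{
	size_t i;

	assert(dst && src && first <= NR_TOP);

	for (i = first; i < NR_TOP; i++)
		dst->slot[i] = src->slot[i];
}
```

In this model, a mapping added later under a slot that was already
populated at copy time shows up through both tables, whereas a
top-level slot filled in only after the copy stays empty in the
snapshot. Populating the relevant top-level slots before taking the
snapshot is what makes a later sync unnecessary here.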

Regards,

	Joerg
