All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/kernel: Validate ROM before DMI scanning when SEV-SNP is active
@ 2024-02-13  4:07 Kevin Loughlin
  2024-02-13 20:02 ` Michael Roth
  0 siblings, 1 reply; 24+ messages in thread
From: Kevin Loughlin @ 2024-02-13  4:07 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Baoquan He, Josh Poimboeuf,
	Peter Zijlstra (Intel),
	Yuntao Wang, Kevin Loughlin, Tom Lendacky, Ard Biesheuvel,
	Dionna Glaze, Alexander Shishkin, Ross Lagerwall, Kai Huang,
	Brijesh Singh, linux-kernel
  Cc: Adam Dunlap, Peter Gonda, Jacob Xu, Sidharth Telang,
	Conrad Grobler, Andri Saar

SEV-SNP requires encrypted memory to be validated before access. The
kernel is responsible for validating the ROM memory range because the
range is not part of the e820 table and therefore not pre-validated by
the BIOS.

While the current SEV-SNP code attempts to validate the ROM range in
probe_roms(), this does not suffice for all existing use cases. In
particular, if EFI_CONFIG_TABLES are not enabled and
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
falls in the ROM range) prior to validation. The specific problematic
call chain occurs during dmi_setup() -> dmi_scan_machine() and results
in a crash during boot if SEV-SNP is enabled under these conditions.

This commit thus provides the simple solution of moving the ROM range
validation from probe_roms() to before dmi_setup(), such that a SEV-SNP
guest satisfying the above use case now successfully boots.

Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
---
 arch/x86/include/asm/setup.h |  6 ++++++
 arch/x86/kernel/probe_roms.c | 19 +++++++++----------
 arch/x86/kernel/setup.c      | 10 ++++++++++
 3 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index 5c83729c8e71..5c8f5b0d0f9f 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -117,6 +117,12 @@ void *extend_brk(size_t size, size_t align);
 	__section(".bss..brk") __aligned(1) __used	\
 	static char __brk_##name[size]
 
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+void snp_prep_rom_range(void);
+#else
+static inline void snp_prep_rom_range(void) { }
+#endif
+
 extern void probe_roms(void);
 
 void clear_bss(void);
diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
index 319fef37d9dc..83b192f5e3cc 100644
--- a/arch/x86/kernel/probe_roms.c
+++ b/arch/x86/kernel/probe_roms.c
@@ -196,6 +196,15 @@ static int __init romchecksum(const unsigned char *rom, unsigned long length)
 	return !length && !sum;
 }
 
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+void __init snp_prep_rom_range(void)
+{
+	snp_prep_memory(video_rom_resource.start,
+			((system_rom_resource.end + 1) - video_rom_resource.start),
+			SNP_PAGE_STATE_PRIVATE);
+}
+#endif
+
 void __init probe_roms(void)
 {
 	unsigned long start, length, upper;
@@ -203,16 +212,6 @@ void __init probe_roms(void)
 	unsigned char c;
 	int i;
 
-	/*
-	 * The ROM memory range is not part of the e820 table and is therefore not
-	 * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
-	 * memory, and SNP requires encrypted memory to be validated before access.
-	 * Do that here.
-	 */
-	snp_prep_memory(video_rom_resource.start,
-			((system_rom_resource.end + 1) - video_rom_resource.start),
-			SNP_PAGE_STATE_PRIVATE);
-
 	/* video rom */
 	upper = adapter_rom_resources[0].start;
 	for (start = video_rom_resource.start; start < upper; start += 2048) {
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 84201071dfac..19f870728486 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -902,6 +902,16 @@ void __init setup_arch(char **cmdline_p)
 		efi_init();
 
 	reserve_ibft_region();
+
+	/*
+	 * The ROM memory range is not part of the e820 table and is therefore not
+	 * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
+	 * memory, and SNP requires encrypted memory to be validated before access.
+	 * This should be done before dmi_setup(), which may access the ROM region
+	 * even before probe_roms() is called.
+	 */
+	snp_prep_rom_range();
+
 	dmi_setup();
 
 	/*
-- 
2.43.0.687.g38aa6559b0-goog


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/kernel: Validate ROM before DMI scanning when SEV-SNP is active
  2024-02-13  4:07 [PATCH] x86/kernel: Validate ROM before DMI scanning when SEV-SNP is active Kevin Loughlin
@ 2024-02-13 20:02 ` Michael Roth
  2024-02-13 23:10   ` Kevin Loughlin
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Roth @ 2024-02-13 20:02 UTC (permalink / raw)
  To: Alexander Shishkin, Ard Biesheuvel, Baoquan He, Borislav Petkov,
	Brijesh Singh, Dave Hansen, Dionna Glaze, H.Peter Anvin,
	Ingo Molnar, Josh Poimboeuf, Kai Huang, Kevin Loughlin,
	Peter Zijlstra, Ross Lagerwall, Thomas Gleixner, Tom Lendacky,
	Yuntao Wang, linux-kernel, x86
  Cc: Adam Dunlap, Peter Gonda, Jacob Xu, Sidharth Telang,
	Conrad Grobler, Andri Saar

Quoting Kevin Loughlin (2024-02-12 22:07:46)
> SEV-SNP requires encrypted memory to be validated before access. The
> kernel is responsible for validating the ROM memory range because the
> range is not part of the e820 table and therefore not pre-validated by
> the BIOS.
> 
> While the current SEV-SNP code attempts to validate the ROM range in
> probe_roms(), this does not suffice for all existing use cases. In
> particular, if EFI_CONFIG_TABLES are not enabled and
> CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> falls in the ROM range) prior to validation. The specific problematic
> call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> in a crash during boot if SEV-SNP is enabled under these conditions.

AFAIK, QEMU doesn't actually include any legacy ROMs as part of the initial
encrypted guest image, and I'm not aware of any VMM implementations that
do this either. As a result, it seems like snp_prep_rom_range() would
only result in the guest seeing ciphertext in these ranges.

If dmi_setup() similarly scans these ranges, it seems likely the same
issue would be present: the validated/private regions would only contain
ciphertext rather than the expected ROM data. Does that agree with the
behavior you are seeing?

If so, maybe instead probe_roms should just be skipped in the case of SNP?
And perhaps dmi_setup() should similarly skip the legacy ROM ranges for
the kernel configs in question?

-Mike

> 
> This commit thus provides the simple solution of moving the ROM range
> validation from probe_roms() to before dmi_setup(), such that a SEV-SNP
> guest satisfying the above use case now successfully boots.
> 
> Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
> Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
> ---
>  arch/x86/include/asm/setup.h |  6 ++++++
>  arch/x86/kernel/probe_roms.c | 19 +++++++++----------
>  arch/x86/kernel/setup.c      | 10 ++++++++++
>  3 files changed, 25 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
> index 5c83729c8e71..5c8f5b0d0f9f 100644
> --- a/arch/x86/include/asm/setup.h
> +++ b/arch/x86/include/asm/setup.h
> @@ -117,6 +117,12 @@ void *extend_brk(size_t size, size_t align);
>         __section(".bss..brk") __aligned(1) __used      \
>         static char __brk_##name[size]
>  
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +void snp_prep_rom_range(void);
> +#else
> +static inline void snp_prep_rom_range(void) { }
> +#endif
> +
>  extern void probe_roms(void);
>  
>  void clear_bss(void);
> diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
> index 319fef37d9dc..83b192f5e3cc 100644
> --- a/arch/x86/kernel/probe_roms.c
> +++ b/arch/x86/kernel/probe_roms.c
> @@ -196,6 +196,15 @@ static int __init romchecksum(const unsigned char *rom, unsigned long length)
>         return !length && !sum;
>  }
>  
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +void __init snp_prep_rom_range(void)
> +{
> +       snp_prep_memory(video_rom_resource.start,
> +                       ((system_rom_resource.end + 1) - video_rom_resource.start),
> +                       SNP_PAGE_STATE_PRIVATE);
> +}
> +#endif
> +
>  void __init probe_roms(void)
>  {
>         unsigned long start, length, upper;
> @@ -203,16 +212,6 @@ void __init probe_roms(void)
>         unsigned char c;
>         int i;
>  
> -       /*
> -        * The ROM memory range is not part of the e820 table and is therefore not
> -        * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
> -        * memory, and SNP requires encrypted memory to be validated before access.
> -        * Do that here.
> -        */
> -       snp_prep_memory(video_rom_resource.start,
> -                       ((system_rom_resource.end + 1) - video_rom_resource.start),
> -                       SNP_PAGE_STATE_PRIVATE);
> -
>         /* video rom */
>         upper = adapter_rom_resources[0].start;
>         for (start = video_rom_resource.start; start < upper; start += 2048) {
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 84201071dfac..19f870728486 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -902,6 +902,16 @@ void __init setup_arch(char **cmdline_p)
>                 efi_init();
>  
>         reserve_ibft_region();
> +
> +       /*
> +        * The ROM memory range is not part of the e820 table and is therefore not
> +        * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
> +        * memory, and SNP requires encrypted memory to be validated before access.
> +        * This should be done before dmi_setup(), which may access the ROM region
> +        * even before probe_roms() is called.
> +        */
> +       snp_prep_rom_range();
> +
>         dmi_setup();
>  
>         /*
> -- 
> 2.43.0.687.g38aa6559b0-goog
> 
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/kernel: Validate ROM before DMI scanning when SEV-SNP is active
  2024-02-13 20:02 ` Michael Roth
@ 2024-02-13 23:10   ` Kevin Loughlin
  2024-02-16 22:50     ` Michael Roth
  0 siblings, 1 reply; 24+ messages in thread
From: Kevin Loughlin @ 2024-02-13 23:10 UTC (permalink / raw)
  To: Michael Roth
  Cc: Alexander Shishkin, Ard Biesheuvel, Baoquan He, Borislav Petkov,
	Brijesh Singh, Dave Hansen, Dionna Glaze, H.Peter Anvin,
	Ingo Molnar, Josh Poimboeuf, Kai Huang, Peter Zijlstra,
	Ross Lagerwall, Thomas Gleixner, Tom Lendacky, Yuntao Wang,
	linux-kernel, x86, Adam Dunlap, Peter Gonda, Jacob Xu,
	Sidharth Telang, Conrad Grobler, Andri Saar

On Tue, Feb 13, 2024 at 12:03 PM Michael Roth <michael.roth@amd.com> wrote:
>
> Quoting Kevin Loughlin (2024-02-12 22:07:46)
> > SEV-SNP requires encrypted memory to be validated before access. The
> > kernel is responsible for validating the ROM memory range because the
> > range is not part of the e820 table and therefore not pre-validated by
> > the BIOS.
> >
> > While the current SEV-SNP code attempts to validate the ROM range in
> > probe_roms(), this does not suffice for all existing use cases. In
> > particular, if EFI_CONFIG_TABLES are not enabled and
> > CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> > attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> > falls in the ROM range) prior to validation. The specific problematic
> > call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> > in a crash during boot if SEV-SNP is enabled under these conditions.
>
> AFAIK, QEMU doesn't actually include any legacy ROMs as part of the initial
> encrypted guest image, and I'm not aware of any VMM implementations that
> do this either.

I'm using a VMM implementation that uses (non-EFI) Oak stage0 firmware [0].

[0] https://github.com/project-oak/oak/tree/main/stage0_bin

> If dmi_setup() similarly scans these ranges, it seems likely the same
> issue would be present: the validated/private regions would only contain
> ciphertext rather than the expected ROM data. Does that agree with the
> behavior you are seeing?
>
> If so, maybe instead probe_roms should just be skipped in the case of SNP?

If probe_roms() is skipped, SEV-SNP guest boot also currently crashes;
I just quickly tried that (though admittedly haven't looked into why).
Apparently though, the fix for early ROM range accesses is not as
simple as just skipping probe_roms() if SEV-SNP is enabled.
Furthermore, skipping probe_roms() was also *not* the route taken in
the initial attempt that prevents this issue for EFI use cases [1].

[1] https://lore.kernel.org/lkml/20220307213356.2797205-21-brijesh.singh@amd.com/

> And perhaps dmi_setup() should similarly skip the legacy ROM ranges for
> the kernel configs in question?

Given (a) non-EFI firmware is supported in other SME/SEV boot code
patches [2], (b) this patch does not seem to introduce significant
complexity (it just moves [1] to earlier in the boot process to
additionally handle the non-EFI case), and (c) skipping
probe_roms()+dmi_setup() doesn't work without additional changes, I'm
currently still inclined to simply validate the legacy ROM ranges
early enough to prevent this issue (as is already done when using EFI
firmware).

[2] https://lore.kernel.org/lkml/CAMj1kXFZKM5wU8djcVBxDmnCJwV4Xpest6u1EbE=7wyLUUeUUQ@mail.gmail.com/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/kernel: Validate ROM before DMI scanning when SEV-SNP is active
  2024-02-13 23:10   ` Kevin Loughlin
@ 2024-02-16 22:50     ` Michael Roth
  2024-02-21 22:50       ` Kevin Loughlin
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Roth @ 2024-02-16 22:50 UTC (permalink / raw)
  To: Kevin Loughlin
  Cc: Alexander Shishkin, Ard Biesheuvel, Baoquan He, Borislav Petkov,
	Brijesh Singh, Dave Hansen, Dionna Glaze, H.Peter Anvin,
	Ingo Molnar, Josh Poimboeuf, Kai Huang, Peter Zijlstra,
	Ross Lagerwall, Thomas Gleixner, Tom Lendacky, Yuntao Wang,
	linux-kernel, x86, Adam Dunlap, Peter Gonda, Jacob Xu,
	Sidharth Telang, Conrad Grobler, Andri Saar

On Tue, Feb 13, 2024 at 03:10:46PM -0800, Kevin Loughlin wrote:
> On Tue, Feb 13, 2024 at 12:03 PM Michael Roth <michael.roth@amd.com> wrote:
> >
> > Quoting Kevin Loughlin (2024-02-12 22:07:46)
> > > SEV-SNP requires encrypted memory to be validated before access. The
> > > kernel is responsible for validating the ROM memory range because the
> > > range is not part of the e820 table and therefore not pre-validated by
> > > the BIOS.
> > >
> > > While the current SEV-SNP code attempts to validate the ROM range in
> > > probe_roms(), this does not suffice for all existing use cases. In
> > > particular, if EFI_CONFIG_TABLES are not enabled and
> > > CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> > > attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> > > falls in the ROM range) prior to validation. The specific problematic
> > > call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> > > in a crash during boot if SEV-SNP is enabled under these conditions.
> >
> > AFAIK, QEMU doesn't actually include any legacy ROMs as part of the initial
> > encrypted guest image, and I'm not aware of any VMM implementations that
> > do this either.
> 
> I'm using a VMM implementation that uses (non-EFI) Oak stage0 firmware [0].
> 
> [0] https://github.com/project-oak/oak/tree/main/stage0_bin
> 
> > If dmi_setup() similarly scans these ranges, it seems likely the same
> > issue would be present: the validated/private regions would only contain
> > ciphertext rather than the expected ROM data. Does that agree with the
> > behavior you are seeing?
> >
> > If so, maybe instead probe_roms should just be skipped in the case of SNP?
> 
> If probe_roms() is skipped, SEV-SNP guest boot also currently crashes;
> I just quickly tried that (though admittedly haven't looked into why).

default_find_smp_config() will also call smp_scan_config() on
0xF0000-0x10000, so that might be the additional issue you're hitting.
If I skip that for in addition to probe_roms, then boot works for me.

The dmi_setup() case you hit would also need similar handling if taking
this approach.

> Apparently though, the fix for early ROM range accesses is not as
> simple as just skipping probe_roms() if SEV-SNP is enabled.
> Furthermore, skipping probe_roms() was also *not* the route taken in
> the initial attempt that prevents this issue for EFI use cases [1].
> 
> [1] https://lore.kernel.org/lkml/20220307213356.2797205-21-brijesh.singh@amd.com/

It seems the currently handling has a bug that has been in place since the
original SEV guest code was added. If you dump the data that probe_roms()
sees while it is scanning for instances of ROMSIGNATURE (0xaa55) in the
region, you'll see that it is random data that changes on every boot.
The root issue is that this region does not contain encrypted data, and
is only being accessed that way because the early page table has the
encryption bit set for this range.

The effects are subtle: if the code ever sees a pair of bytes that look
like ROMSIGNATURE, it will reserve that memory so it can be accessed
later, generally just 0xc0000-0xc7fff. In extremely rare cases where the
ciphertext's data has a checksum that happens to match the contents, it
will use a random byte, multiple it by 512, and reserve up to 64k for
this bogus ROM region.

For SNP this resulted in a more obvious failure: a #VC exception because
the supposedly encrypted memory was in fact not encrypted, and thus not
PVALIDATED. Unfortunately the fix you linked to involved maintaining the
broken SEV behavior rather than fixing this mismatch.

> 
> > And perhaps dmi_setup() should similarly skip the legacy ROM ranges for
> > the kernel configs in question?
> 
> Given (a) non-EFI firmware is supported in other SME/SEV boot code
> patches [2], (b) this patch does not seem to introduce significant
> complexity (it just moves [1] to earlier in the boot process to
> additionally handle the non-EFI case), and (c) skipping
> probe_roms()+dmi_setup() doesn't work without additional changes, I'm
> currently still inclined to simply validate the legacy ROM ranges
> early enough to prevent this issue (as is already done when using EFI
> firmware).

The 2 options I see are:

  a) Skipping accesses to these regions for SEV. It is vaguely possible
     some implementation out there actually did measure/load the ROM as
     part of the initial guest image for SEV, but for SNP this would
     have been impossible since it would have lead to the guest crashing
     when snp_prep_roms() was called, since RMPUPDATE on the host only
     rescinds the validated bit if there is a change to the RMP entry.
     If it was already assigned/private/validated then the guest code
     would detected that PVALIDATE resulted in no changes, and so it
     would have failed with PVALIDATE_FAIL_NOUPDATE. So if you want to
     be super sure you don't break legacy SEV implementations then you
     could limit the change to SNP guests where it's essentially
     guaranteed these regions are not being utilized in any functional
     way.

  b) Modifying the early page table setup by early_make_pgtable() to
     clear the encrypted bit for 0xC0000-0x100000 legacy region. The
     challenge there is everything is PMD-mapped at that stage of boot
     and there's no infrastructure for splitting page tables to handle
     non-2MB-aligned/sized regions.

But I don't think continuing to propagate the broken SEV behavior is
the right fix. At some point those random scans may trigger something
more problematic than wasted memory reservations. It may even be the
case already since I haven't audited the dmi_setup()/smp_scan_config()
paths yet, but nothing good/useful can come of it.

-Mike

> 
> [2] https://lore.kernel.org/lkml/CAMj1kXFZKM5wU8djcVBxDmnCJwV4Xpest6u1EbE=7wyLUUeUUQ@mail.gmail.com/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/kernel: Validate ROM before DMI scanning when SEV-SNP is active
  2024-02-16 22:50     ` Michael Roth
@ 2024-02-21 22:50       ` Kevin Loughlin
  2024-02-22 20:20         ` Michael Roth
  2024-02-22 20:24         ` [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests Kevin Loughlin
  0 siblings, 2 replies; 24+ messages in thread
From: Kevin Loughlin @ 2024-02-21 22:50 UTC (permalink / raw)
  To: Michael Roth
  Cc: Alexander Shishkin, Ard Biesheuvel, Baoquan He, Borislav Petkov,
	Brijesh Singh, Dave Hansen, Dionna Glaze, H.Peter Anvin,
	Ingo Molnar, Josh Poimboeuf, Kai Huang, Peter Zijlstra,
	Ross Lagerwall, Thomas Gleixner, Tom Lendacky, Yuntao Wang,
	linux-kernel, x86, Adam Dunlap, Peter Gonda, Jacob Xu,
	Sidharth Telang, Conrad Grobler, Andri Saar

On Fri, Feb 16, 2024 at 2:50 PM Michael Roth <michael.roth@amd.com> wrote:
>
> On Tue, Feb 13, 2024 at 03:10:46PM -0800, Kevin Loughlin wrote:
> > On Tue, Feb 13, 2024 at 12:03 PM Michael Roth <michael.roth@amd.com> wrote:
> > >
> > > Quoting Kevin Loughlin (2024-02-12 22:07:46)
> > > > SEV-SNP requires encrypted memory to be validated before access. The
> > > > kernel is responsible for validating the ROM memory range because the
> > > > range is not part of the e820 table and therefore not pre-validated by
> > > > the BIOS.
> > > >
> > > > While the current SEV-SNP code attempts to validate the ROM range in
> > > > probe_roms(), this does not suffice for all existing use cases. In
> > > > particular, if EFI_CONFIG_TABLES are not enabled and
> > > > CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> > > > attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> > > > falls in the ROM range) prior to validation. The specific problematic
> > > > call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> > > > in a crash during boot if SEV-SNP is enabled under these conditions.
> > >
> > > AFAIK, QEMU doesn't actually include any legacy ROMs as part of the initial
> > > encrypted guest image, and I'm not aware of any VMM implementations that
> > > do this either.
> >
> > I'm using a VMM implementation that uses (non-EFI) Oak stage0 firmware [0].
> >
> > [0] https://github.com/project-oak/oak/tree/main/stage0_bin
> >
> > > If dmi_setup() similarly scans these ranges, it seems likely the same
> > > issue would be present: the validated/private regions would only contain
> > > ciphertext rather than the expected ROM data. Does that agree with the
> > > behavior you are seeing?
> > >
> > > If so, maybe instead probe_roms should just be skipped in the case of SNP?
> >
> > If probe_roms() is skipped, SEV-SNP guest boot also currently crashes;
> > I just quickly tried that (though admittedly haven't looked into why).
>
> default_find_smp_config() will also call smp_scan_config() on
> 0xF0000-0x10000, so that might be the additional issue you're hitting.
> If I skip that for in addition to probe_roms, then boot works for me.

Yeah, smp_scan_config() was the culprit. Thanks.

> It seems the currently handling has a bug that has been in place since the
> original SEV guest code was added. If you dump the data that probe_roms()
> sees while it is scanning for instances of ROMSIGNATURE (0xaa55) in the
> region, you'll see that it is random data that changes on every boot.
> The root issue is that this region does not contain encrypted data, and
> is only being accessed that way because the early page table has the
> encryption bit set for this range.
>
> The effects are subtle: if the code ever sees a pair of bytes that look
> like ROMSIGNATURE, it will reserve that memory so it can be accessed
> later, generally just 0xc0000-0xc7fff. In extremely rare cases where the
> ciphertext's data has a checksum that happens to match the contents, it
> will use a random byte, multiple it by 512, and reserve up to 64k for
> this bogus ROM region.
>
> For SNP this resulted in a more obvious failure: a #VC exception because
> the supposedly encrypted memory was in fact not encrypted, and thus not
> PVALIDATED. Unfortunately the fix you linked to involved maintaining the
> broken SEV behavior rather than fixing this mismatch.
>
> >
> > > And perhaps dmi_setup() should similarly skip the legacy ROM ranges for
> > > the kernel configs in question?
> >
> > Given (a) non-EFI firmware is supported in other SME/SEV boot code
> > patches [2], (b) this patch does not seem to introduce significant
> > complexity (it just moves [1] to earlier in the boot process to
> > additionally handle the non-EFI case), and (c) skipping
> > probe_roms()+dmi_setup() doesn't work without additional changes, I'm
> > currently still inclined to simply validate the legacy ROM ranges
> > early enough to prevent this issue (as is already done when using EFI
> > firmware).
>
> The 2 options I see are:
>
>   a) Skipping accesses to these regions for SEV. It is vaguely possible
>      some implementation out there actually did measure/load the ROM as
>      part of the initial guest image for SEV, but for SNP this would
>      have been impossible since it would have lead to the guest crashing
>      when snp_prep_roms() was called, since RMPUPDATE on the host only
>      rescinds the validated bit if there is a change to the RMP entry.
>      If it was already assigned/private/validated then the guest code
>      would detected that PVALIDATE resulted in no changes, and so it
>      would have failed with PVALIDATE_FAIL_NOUPDATE. So if you want to
>      be super sure you don't break legacy SEV implementations then you
>      could limit the change to SNP guests where it's essentially
>      guaranteed these regions are not being utilized in any functional
>      way.

Based on your explanation, I agree that (at a minimum) it makes sense
to rectify the behavior for SEV-SNP guests.

On that note, as you describe here, I skipped the 3 ROM region scans
on platforms with CC_ATTR_GUEST_SEV_SNP (and deleted the call to
snp_prep_memory()) and successfully booted. I can send that as v2.

Note that I have *not* tried skipping the scans for all SEV guest
variants (CC_ATTR_GUEST_MEM_ENCRYPT) since those boots appear to be
functioning without the change (and there is a risk of breaking the
sorts of implementations that you described); also note that
clang-built SEV-SNP guests still require [0] and [1] to function.

[0] https://lore.kernel.org/all/20240206223620.1833276-1-acdunlap@google.com/
[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=1c811d403afd73f04bde82b83b24c754011bd0e8

>   b) Modifying the early page table setup by early_make_pgtable() to
>      clear the encrypted bit for 0xC0000-0x100000 legacy region. The
>      challenge there is everything is PMD-mapped at that stage of boot
>      and there's no infrastructure for splitting page tables to handle
>      non-2MB-aligned/sized regions.

If ever needed/desired, a slight variant of this second option might
also be providing a temporary unencrypted mapping on the fly during
the few times the regions are scanned during early boot, similar to
how __sme_early_map_unmap_mem() is already used for sme_map_bootdata()
in head64.c. I haven't tried it, but I just wanted to note it down in
case it becomes relevant.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/kernel: Validate ROM before DMI scanning when SEV-SNP is active
  2024-02-21 22:50       ` Kevin Loughlin
@ 2024-02-22 20:20         ` Michael Roth
  2024-02-22 20:24         ` [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests Kevin Loughlin
  1 sibling, 0 replies; 24+ messages in thread
From: Michael Roth @ 2024-02-22 20:20 UTC (permalink / raw)
  To: Kevin Loughlin
  Cc: Alexander Shishkin, Ard Biesheuvel, Baoquan He, Borislav Petkov,
	Brijesh Singh, Dave Hansen, Dionna Glaze, H.Peter Anvin,
	Ingo Molnar, Josh Poimboeuf, Kai Huang, Peter Zijlstra,
	Ross Lagerwall, Thomas Gleixner, Tom Lendacky, Yuntao Wang,
	linux-kernel, x86, Adam Dunlap, Peter Gonda, Jacob Xu,
	Sidharth Telang, Conrad Grobler, Andri Saar

On Wed, Feb 21, 2024 at 02:50:00PM -0800, Kevin Loughlin wrote:
> On Fri, Feb 16, 2024 at 2:50 PM Michael Roth <michael.roth@amd.com> wrote:
> >
> > On Tue, Feb 13, 2024 at 03:10:46PM -0800, Kevin Loughlin wrote:
> > > On Tue, Feb 13, 2024 at 12:03 PM Michael Roth <michael.roth@amd.com> wrote:
> > > >
> > > > Quoting Kevin Loughlin (2024-02-12 22:07:46)
> > > > > SEV-SNP requires encrypted memory to be validated before access. The
> > > > > kernel is responsible for validating the ROM memory range because the
> > > > > range is not part of the e820 table and therefore not pre-validated by
> > > > > the BIOS.
> > > > >
> > > > > While the current SEV-SNP code attempts to validate the ROM range in
> > > > > probe_roms(), this does not suffice for all existing use cases. In
> > > > > particular, if EFI_CONFIG_TABLES are not enabled and
> > > > > CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> > > > > attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> > > > > falls in the ROM range) prior to validation. The specific problematic
> > > > > call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> > > > > in a crash during boot if SEV-SNP is enabled under these conditions.
> > > >
> > > > AFAIK, QEMU doesn't actually include any legacy ROMs as part of the initial
> > > > encrypted guest image, and I'm not aware of any VMM implementations that
> > > > do this either.
> > >
> > > I'm using a VMM implementation that uses (non-EFI) Oak stage0 firmware [0].
> > >
> > > [0] https://github.com/project-oak/oak/tree/main/stage0_bin
> > >
> > > > If dmi_setup() similarly scans these ranges, it seems likely the same
> > > > issue would be present: the validated/private regions would only contain
> > > > ciphertext rather than the expected ROM data. Does that agree with the
> > > > behavior you are seeing?
> > > >
> > > > If so, maybe instead probe_roms should just be skipped in the case of SNP?
> > >
> > > If probe_roms() is skipped, SEV-SNP guest boot also currently crashes;
> > > I just quickly tried that (though admittedly haven't looked into why).
> >
> > default_find_smp_config() will also call smp_scan_config() on
> > 0xF0000-0x10000, so that might be the additional issue you're hitting.
> > If I skip that for in addition to probe_roms, then boot works for me.
> 
> Yeah, smp_scan_config() was the culprit. Thanks.
> 
> > It seems the currently handling has a bug that has been in place since the
> > original SEV guest code was added. If you dump the data that probe_roms()
> > sees while it is scanning for instances of ROMSIGNATURE (0xaa55) in the
> > region, you'll see that it is random data that changes on every boot.
> > The root issue is that this region does not contain encrypted data, and
> > is only being accessed that way because the early page table has the
> > encryption bit set for this range.
> >
> > The effects are subtle: if the code ever sees a pair of bytes that look
> > like ROMSIGNATURE, it will reserve that memory so it can be accessed
> > later, generally just 0xc0000-0xc7fff. In extremely rare cases where the
> > ciphertext's data has a checksum that happens to match the contents, it
> > will use a random byte, multiple it by 512, and reserve up to 64k for
> > this bogus ROM region.
> >
> > For SNP this resulted in a more obvious failure: a #VC exception because
> > the supposedly encrypted memory was in fact not encrypted, and thus not
> > PVALIDATED. Unfortunately the fix you linked to involved maintaining the
> > broken SEV behavior rather than fixing this mismatch.
> >
> > >
> > > > And perhaps dmi_setup() should similarly skip the legacy ROM ranges for
> > > > the kernel configs in question?
> > >
> > > Given (a) non-EFI firmware is supported in other SME/SEV boot code
> > > patches [2], (b) this patch does not seem to introduce significant
> > > complexity (it just moves [1] to earlier in the boot process to
> > > additionally handle the non-EFI case), and (c) skipping
> > > probe_roms()+dmi_setup() doesn't work without additional changes, I'm
> > > currently still inclined to simply validate the legacy ROM ranges
> > > early enough to prevent this issue (as is already done when using EFI
> > > firmware).
> >
> > The 2 options I see are:
> >
> >   a) Skipping accesses to these regions for SEV. It is vaguely possible
> >      some implementation out there actually did measure/load the ROM as
> >      part of the initial guest image for SEV, but for SNP this would
> >      have been impossible since it would have lead to the guest crashing
> >      when snp_prep_roms() was called, since RMPUPDATE on the host only
> >      rescinds the validated bit if there is a change to the RMP entry.
> >      If it was already assigned/private/validated then the guest code
> >      would detected that PVALIDATE resulted in no changes, and so it
> >      would have failed with PVALIDATE_FAIL_NOUPDATE. So if you want to
> >      be super sure you don't break legacy SEV implementations then you
> >      could limit the change to SNP guests where it's essentially
> >      guaranteed these regions are not being utilized in any functional
> >      way.
> 
> Based on your explanation, I agree that (at a minimum) it makes sense
> to rectify the behavior for SEV-SNP guests.
> 
> On that note, as you describe here, I skipped the 3 ROM region scans
> on platforms with CC_ATTR_GUEST_SEV_SNP (and deleted the call to
> snp_prep_memory()) and successfully booted. I can send that as v2.

Sounds good. Please add me to the Cc, happy to test/review.

> 
> Note that I have *not* tried skipping the scans for all SEV guest
> variants (CC_ATTR_GUEST_MEM_ENCRYPT) since those boots appear to be
> functioning without the change (and there is a risk of breaking the
> sorts of implementations that you described); also note that
> clang-built SEV-SNP guests still require [0] and [1] to function.
> 
> [0] https://lore.kernel.org/all/20240206223620.1833276-1-acdunlap@google.com/
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=1c811d403afd73f04bde82b83b24c754011bd0e8
> 
> >   b) Modifying the early page table setup by early_make_pgtable() to
> >      clear the encrypted bit for 0xC0000-0x100000 legacy region. The
> >      challenge there is everything is PMD-mapped at that stage of boot
> >      and there's no infrastructure for splitting page tables to handle
> >      non-2MB-aligned/sized regions.
> 
> If ever needed/desired, a slight variant of this second option might
> also be providing a temporary unencrypted mapping on the fly during
> the few times the regions are scanned during early boot, similar to
> how __sme_early_map_unmap_mem() is already used for sme_map_bootdata()
> in head64.c. I haven't tried it, but I just wanted to note it down in
> case it becomes relevant.

True, that might be another option to consider if needed.

-Mike

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-02-21 22:50       ` Kevin Loughlin
  2024-02-22 20:20         ` Michael Roth
@ 2024-02-22 20:24         ` Kevin Loughlin
  2024-02-26 19:15           ` Mike Stunes
                             ` (2 more replies)
  1 sibling, 3 replies; 24+ messages in thread
From: Kevin Loughlin @ 2024-02-22 20:24 UTC (permalink / raw)
  To: kevinloughlin
  Cc: acdunlap, alexander.shishkin, andrisaar, ardb, bhe, bp,
	brijesh.singh, dave.hansen, dionnaglaze, grobler, hpa, jacobhxu,
	jpoimboe, kai.huang, linux-kernel, michael.roth, mingo, peterz,
	pgonda, ross.lagerwall, sidtelang, tglx, thomas.lendacky, x86,
	ytcoode

SEV-SNP requires encrypted memory to be validated before access.
Because the ROM memory range is not part of the e820 table, it is not
pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
to access this range, the guest must first validate the range.

The current SEV-SNP code does indeed scan the ROM range during early
boot and thus attempts to validate the ROM range in probe_roms().
However, this behavior is neither necessary nor sufficient.

With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
falls in the ROM range) prior to validation. The specific problematic
call chain occurs during dmi_setup() -> dmi_scan_machine() and results
in a crash during boot if SEV-SNP is enabled under these conditions.

With regards to necessity, SEV-SNP guests currently read garbage (which
changes across boots) from the ROM range, meaning these scans are
unnecessary. The guest reads garbage because the legacy ROM range
is unencrypted data but is accessed via an encrypted PMD during early
boot (where the PMD is marked as encrypted due to potentially mapping
actually-encrypted data in other PMD-contained ranges).

While one solution would be to overhaul the early PMD mapping to treat
the ROM region of the PMD as unencrypted, SEV-SNP guests do not rely on
data from the legacy ROM region during early boot (nor can they
currently, since the data would be garbage that changes across boots).
As such, this patch opts for the simpler approach of skipping the ROM
range scans (and the otherwise-necessary range validation) during
SEV-SNP guest early boot.

Ultimatly, the potential SEV-SNP guest crash due to lack of ROM range
validation is avoided by simply not accessing the ROM range.

Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
---
 arch/x86/include/asm/sev.h   |  2 --
 arch/x86/kernel/mpparse.c    |  7 +++++++
 arch/x86/kernel/probe_roms.c | 11 ++++-------
 arch/x86/kernel/sev.c        | 15 ---------------
 drivers/firmware/dmi_scan.c  |  7 ++++++-
 5 files changed, 17 insertions(+), 25 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 5b4a1ce3d368..474c24ba0f6f 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -203,7 +203,6 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
 					 unsigned long npages);
 void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
 					unsigned long npages);
-void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
 void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
 void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
 void snp_set_wakeup_secondary_cpu(void);
@@ -227,7 +226,6 @@ static inline void __init
 early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
 static inline void __init
 early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
-static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
 static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
 static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
 static inline void snp_set_wakeup_secondary_cpu(void) { }
diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
index b223922248e9..39ea771e2d4c 100644
--- a/arch/x86/kernel/mpparse.c
+++ b/arch/x86/kernel/mpparse.c
@@ -553,6 +553,13 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
 		    base, base + length - 1);
 	BUILD_BUG_ON(sizeof(*mpf) != 16);
 
+	/*
+	 * Skip scan in SEV-SNP guest if it would touch the legacy ROM region,
+	 * as this memory is not pre-validated and would thus cause a crash.
+	 */
+	if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && base < 0x100000 && base + length >= 0xC0000)
+		return 0;
+
 	while (length > 0) {
 		bp = early_memremap(base, length);
 		mpf = (struct mpf_intel *)bp;
diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
index 319fef37d9dc..84ff4b052fc1 100644
--- a/arch/x86/kernel/probe_roms.c
+++ b/arch/x86/kernel/probe_roms.c
@@ -204,14 +204,11 @@ void __init probe_roms(void)
 	int i;
 
 	/*
-	 * The ROM memory range is not part of the e820 table and is therefore not
-	 * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
-	 * memory, and SNP requires encrypted memory to be validated before access.
-	 * Do that here.
+	 * These probes are skipped in SEV-SNP guests because the ROM range
+	 * is not pre-validated, meaning access would cause a crash.
 	 */
-	snp_prep_memory(video_rom_resource.start,
-			((system_rom_resource.end + 1) - video_rom_resource.start),
-			SNP_PAGE_STATE_PRIVATE);
+	if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return;
 
 	/* video rom */
 	upper = adapter_rom_resources[0].start;
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index c67285824e82..d2362631da91 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -774,21 +774,6 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
 	early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED);
 }
 
-void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
-{
-	unsigned long vaddr, npages;
-
-	vaddr = (unsigned long)__va(paddr);
-	npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
-
-	if (op == SNP_PAGE_STATE_PRIVATE)
-		early_snp_set_memory_private(vaddr, paddr, npages);
-	else if (op == SNP_PAGE_STATE_SHARED)
-		early_snp_set_memory_shared(vaddr, paddr, npages);
-	else
-		WARN(1, "invalid memory op %d\n", op);
-}
-
 static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
 				       unsigned long vaddr_end, int op)
 {
diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
index 015c95a825d3..22e27087eb5b 100644
--- a/drivers/firmware/dmi_scan.c
+++ b/drivers/firmware/dmi_scan.c
@@ -703,7 +703,12 @@ static void __init dmi_scan_machine(void)
 			dmi_available = 1;
 			return;
 		}
-	} else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK)) {
+	} else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK) &&
+		!cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
+		/*
+		 * This scan is skipped in SEV-SNP guests because the ROM range
+		 * is not pre-validated, meaning access would cause a crash.
+		 */
 		p = dmi_early_remap(SMBIOS_ENTRY_POINT_SCAN_START, 0x10000);
 		if (p == NULL)
 			goto error;
-- 
2.44.0.rc0.258.g7320e95886-goog


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-02-22 20:24         ` [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests Kevin Loughlin
@ 2024-02-26 19:15           ` Mike Stunes
  2024-03-08 16:10             ` Kevin Loughlin
  2024-02-29 16:54           ` Borislav Petkov
  2024-03-08 10:30           ` Ard Biesheuvel
  2 siblings, 1 reply; 24+ messages in thread
From: Mike Stunes @ 2024-02-26 19:15 UTC (permalink / raw)
  To: Kevin Loughlin
  Cc: acdunlap, alexander.shishkin, andrisaar, ardb, bhe,
	Borislav Petkov, Brijesh Singh, dave.hansen, Dionna Amalie Glaze,
	grobler, hpa, jacobhxu, jpoimboe, Huang, Kai, linux-kernel,
	Michael Roth, mingo, peterz, Peter Gonda, ross.lagerwall,
	sidtelang, tglx, Thomas Lendacky, x86, ytcoode

Hi,

> On Feb 22, 2024, at 12:24 PM, Kevin Loughlin <kevinloughlin@google.com> wrote:
> 
> SEV-SNP requires encrypted memory to be validated before access.
> Because the ROM memory range is not part of the e820 table, it is not
> pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
> to access this range, the guest must first validate the range.
> 
> The current SEV-SNP code does indeed scan the ROM range during early
> boot and thus attempts to validate the ROM range in probe_roms().
> However, this behavior is neither necessary nor sufficient.
> 
> With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
> CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> falls in the ROM range) prior to validation. The specific problematic
> call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> in a crash during boot if SEV-SNP is enabled under these conditions.
> 
> With regards to necessity, SEV-SNP guests currently read garbage (which
> changes across boots) from the ROM range, meaning these scans are
> unnecessary. The guest reads garbage because the legacy ROM range
> is unencrypted data but is accessed via an encrypted PMD during early
> boot (where the PMD is marked as encrypted due to potentially mapping
> actually-encrypted data in other PMD-contained ranges).
> 
> While one solution would be to overhaul the early PMD mapping to treat
> the ROM region of the PMD as unencrypted, SEV-SNP guests do not rely on
> data from the legacy ROM region during early boot (nor can they
> currently, since the data would be garbage that changes across boots).
> As such, this patch opts for the simpler approach of skipping the ROM
> range scans (and the otherwise-necessary range validation) during
> SEV-SNP guest early boot.
> 
> Ultimatly, the potential SEV-SNP guest crash due to lack of ROM range
> validation is avoided by simply not accessing the ROM range.
> 
> Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
> Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
> ---
> arch/x86/include/asm/sev.h   |  2 --
> arch/x86/kernel/mpparse.c    |  7 +++++++
> arch/x86/kernel/probe_roms.c | 11 ++++-------
> arch/x86/kernel/sev.c        | 15 ---------------
> drivers/firmware/dmi_scan.c  |  7 ++++++-
> 5 files changed, 17 insertions(+), 25 deletions(-)
> 
> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index 5b4a1ce3d368..474c24ba0f6f 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -203,7 +203,6 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
> unsigned long npages);
> void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
> unsigned long npages);
> -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
> void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
> void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
> void snp_set_wakeup_secondary_cpu(void);
> @@ -227,7 +226,6 @@ static inline void __init
> early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> static inline void __init
> early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> -static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
> static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
> static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
> static inline void snp_set_wakeup_secondary_cpu(void) { }
> diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
> index b223922248e9..39ea771e2d4c 100644
> --- a/arch/x86/kernel/mpparse.c
> +++ b/arch/x86/kernel/mpparse.c
> @@ -553,6 +553,13 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
>    base, base + length - 1);
> BUILD_BUG_ON(sizeof(*mpf) != 16);
> 
> + /*
> + * Skip scan in SEV-SNP guest if it would touch the legacy ROM region,
> + * as this memory is not pre-validated and would thus cause a crash.
> + */
> + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && base < 0x100000 && base + length >= 0xC0000)
> + return 0;
> +
> while (length > 0) {
> bp = early_memremap(base, length);
> mpf = (struct mpf_intel *)bp;
> diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
> index 319fef37d9dc..84ff4b052fc1 100644
> --- a/arch/x86/kernel/probe_roms.c
> +++ b/arch/x86/kernel/probe_roms.c
> @@ -204,14 +204,11 @@ void __init probe_roms(void)
> int i;
> 
> /*
> - * The ROM memory range is not part of the e820 table and is therefore not
> - * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
> - * memory, and SNP requires encrypted memory to be validated before access.
> - * Do that here.
> + * These probes are skipped in SEV-SNP guests because the ROM range
> + * is not pre-validated, meaning access would cause a crash.
> */
> - snp_prep_memory(video_rom_resource.start,
> - ((system_rom_resource.end + 1) - video_rom_resource.start),
> - SNP_PAGE_STATE_PRIVATE);
> + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> + return;
> 
> /* video rom */
> upper = adapter_rom_resources[0].start;
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index c67285824e82..d2362631da91 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -774,21 +774,6 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
> early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED);
> }
> 
> -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
> -{
> - unsigned long vaddr, npages;
> -
> - vaddr = (unsigned long)__va(paddr);
> - npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
> -
> - if (op == SNP_PAGE_STATE_PRIVATE)
> - early_snp_set_memory_private(vaddr, paddr, npages);
> - else if (op == SNP_PAGE_STATE_SHARED)
> - early_snp_set_memory_shared(vaddr, paddr, npages);
> - else
> - WARN(1, "invalid memory op %d\n", op);
> -}
> -
> static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
>       unsigned long vaddr_end, int op)
> {
> diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
> index 015c95a825d3..22e27087eb5b 100644
> --- a/drivers/firmware/dmi_scan.c
> +++ b/drivers/firmware/dmi_scan.c
> @@ -703,7 +703,12 @@ static void __init dmi_scan_machine(void)
> dmi_available = 1;
> return;
> }
> - } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK)) {
> + } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK) &&
> + !cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
> + /*
> + * This scan is skipped in SEV-SNP guests because the ROM range
> + * is not pre-validated, meaning access would cause a crash.
> + */
> p = dmi_early_remap(SMBIOS_ENTRY_POINT_SCAN_START, 0x10000);
> if (p == NULL)
> goto error;
> -- 
> 2.44.0.rc0.258.g7320e95886-goog
> 
> 

In addition to these changes, I also had to skip pirq_find_routing_table if SEV-SNP is active.

Thanks!
Mike


-- 
This electronic communication and the information and any files transmitted 
with it, or attached to it, are confidential and are intended solely for 
the use of the individual or entity to whom it is addressed and may contain 
information that is confidential, legally privileged, protected by privacy 
laws, or otherwise restricted from disclosure to anyone else. If you are 
not the intended recipient or the person responsible for delivering the 
e-mail to the intended recipient, you are hereby notified that any use, 
copying, distributing, dissemination, forwarding, printing, or copying of 
this e-mail is strictly prohibited. If you received this e-mail in error, 
please return the e-mail to the sender, delete it from your computer, and 
destroy any printed copy of it.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-02-22 20:24         ` [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests Kevin Loughlin
  2024-02-26 19:15           ` Mike Stunes
@ 2024-02-29 16:54           ` Borislav Petkov
  2024-03-08 16:14             ` Kevin Loughlin
  2024-03-08 10:30           ` Ard Biesheuvel
  2 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2024-02-29 16:54 UTC (permalink / raw)
  To: Kevin Loughlin
  Cc: acdunlap, alexander.shishkin, andrisaar, ardb, bhe,
	brijesh.singh, dave.hansen, dionnaglaze, grobler, hpa, jacobhxu,
	jpoimboe, kai.huang, linux-kernel, michael.roth, mingo, peterz,
	pgonda, ross.lagerwall, sidtelang, tglx, thomas.lendacky, x86,
	ytcoode

On Thu, Feb 22, 2024 at 08:24:04PM +0000, Kevin Loughlin wrote:
> SEV-SNP requires encrypted memory to be validated before access.
> Because the ROM memory range is not part of the e820 table, it is not
> pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
> to access this range, the guest must first validate the range.
> 
> The current SEV-SNP code does indeed scan the ROM range during early
> boot and thus attempts to validate the ROM range in probe_roms().
> However, this behavior is neither necessary nor sufficient.

Why is this not necessary, all of a sudden?

> With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
> CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will

What is that use case exactly?

CONFIG_DMI_... is usually enabled but the absence of EFI_CONFIG_TABLES
tells me that you're booting some guest with some special OVMF which
doesn't sport such tables.

Why?

/me scrolls upthread

Aha, some project oak thing doing a minimal fw. I can see why but this
should be explained here as to why is this a relevant use case and what
it is using and so on so that future readers can piece it all together.

> attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> falls in the ROM range) prior to validation. The specific problematic
> call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> in a crash during boot if SEV-SNP is enabled under these conditions.
> 
> With regards to necessity, SEV-SNP guests currently read garbage (which
> changes across boots) from the ROM range, meaning these scans are
> unnecessary. The guest reads garbage because the legacy ROM range
> is unencrypted data but is accessed via an encrypted PMD during early
> boot (where the PMD is marked as encrypted due to potentially mapping
> actually-encrypted data in other PMD-contained ranges).

I don't mind ripping that ROM probing thing but that thread we're on
here talks more about why it could be problematic to keep doing so so
pls summarize that here.

A commit should contain all arguments for why it has been arrived at
the decision to do it this way.

> While one solution would be to overhaul the early PMD mapping to treat
> the ROM region of the PMD as unencrypted, SEV-SNP guests do not rely on
> data from the legacy ROM region during early boot (nor can they
> currently, since the data would be garbage that changes across boots).

That's better.

> As such, this patch opts for the simpler approach of skipping the ROM

Avoid having "This patch" or "This commit" in the commit message. It is
tautologically useless.

Also, do

$ git grep 'This patch' Documentation/process

for more details.

> range scans (and the otherwise-necessary range validation) during
> SEV-SNP guest early boot.
> 
> Ultimatly, the potential SEV-SNP guest crash due to lack of ROM range
  ^^^^^^^^^^

Please introduce a spellchecker into your patch creation workflow.

> validation is avoided by simply not accessing the ROM range.
> 
> Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
> Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
> ---
>  arch/x86/include/asm/sev.h   |  2 --
>  arch/x86/kernel/mpparse.c    |  7 +++++++
>  arch/x86/kernel/probe_roms.c | 11 ++++-------
>  arch/x86/kernel/sev.c        | 15 ---------------
>  drivers/firmware/dmi_scan.c  |  7 ++++++-
>  5 files changed, 17 insertions(+), 25 deletions(-)

...

> diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
> index b223922248e9..39ea771e2d4c 100644
> --- a/arch/x86/kernel/mpparse.c
> +++ b/arch/x86/kernel/mpparse.c
> @@ -553,6 +553,13 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
>  		    base, base + length - 1);
>  	BUILD_BUG_ON(sizeof(*mpf) != 16);
>  
> +	/*
> +	 * Skip scan in SEV-SNP guest if it would touch the legacy ROM region,
> +	 * as this memory is not pre-validated and would thus cause a crash.
> +	 */
> +	if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && base < 0x100000 && base + length >= 0xC0000)
> +		return 0;

I don't like spreading around CoCo checks everywhere around the tree.

Think of a better way pls.

> diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
> index 015c95a825d3..22e27087eb5b 100644
> --- a/drivers/firmware/dmi_scan.c
> +++ b/drivers/firmware/dmi_scan.c
> @@ -703,7 +703,12 @@ static void __init dmi_scan_machine(void)
>  			dmi_available = 1;
>  			return;
>  		}
> -	} else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK)) {
> +	} else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK) &&
> +		!cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {

Ditto.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-02-22 20:24         ` [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests Kevin Loughlin
  2024-02-26 19:15           ` Mike Stunes
  2024-02-29 16:54           ` Borislav Petkov
@ 2024-03-08 10:30           ` Ard Biesheuvel
  2024-03-08 11:00             ` Borislav Petkov
  2024-03-08 20:50             ` Kevin Loughlin
  2 siblings, 2 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2024-03-08 10:30 UTC (permalink / raw)
  To: Kevin Loughlin
  Cc: acdunlap, alexander.shishkin, andrisaar, bhe, bp, brijesh.singh,
	dave.hansen, dionnaglaze, grobler, hpa, jacobhxu, jpoimboe,
	kai.huang, linux-kernel, michael.roth, mingo, peterz, pgonda,
	ross.lagerwall, sidtelang, tglx, thomas.lendacky, x86, ytcoode

On Thu, 22 Feb 2024 at 21:25, Kevin Loughlin <kevinloughlin@google.com> wrote:
>
> SEV-SNP requires encrypted memory to be validated before access.
> Because the ROM memory range is not part of the e820 table, it is not
> pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
> to access this range, the guest must first validate the range.
>
> The current SEV-SNP code does indeed scan the ROM range during early
> boot and thus attempts to validate the ROM range in probe_roms().
> However, this behavior is neither necessary nor sufficient.
>
> With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
> CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> falls in the ROM range) prior to validation. The specific problematic
> call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> in a crash during boot if SEV-SNP is enabled under these conditions.
>
> With regards to necessity, SEV-SNP guests currently read garbage (which
> changes across boots) from the ROM range, meaning these scans are
> unnecessary. The guest reads garbage because the legacy ROM range
> is unencrypted data but is accessed via an encrypted PMD during early
> boot (where the PMD is marked as encrypted due to potentially mapping
> actually-encrypted data in other PMD-contained ranges).
>
> While one solution would be to overhaul the early PMD mapping to treat
> the ROM region of the PMD as unencrypted, SEV-SNP guests do not rely on
> data from the legacy ROM region during early boot (nor can they
> currently, since the data would be garbage that changes across boots).
> As such, this patch opts for the simpler approach of skipping the ROM
> range scans (and the otherwise-necessary range validation) during
> SEV-SNP guest early boot.
>
> Ultimatly, the potential SEV-SNP guest crash due to lack of ROM range
> validation is avoided by simply not accessing the ROM range.
>
> Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
> Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
> ---
>  arch/x86/include/asm/sev.h   |  2 --
>  arch/x86/kernel/mpparse.c    |  7 +++++++
>  arch/x86/kernel/probe_roms.c | 11 ++++-------
>  arch/x86/kernel/sev.c        | 15 ---------------
>  drivers/firmware/dmi_scan.c  |  7 ++++++-
>  5 files changed, 17 insertions(+), 25 deletions(-)
>

Agree with the analysis and the conclusion. However, this will need to
be split into generic and x86 specific changes, given that the DMI
code is shared between all architectures, and explicitly checking for
SEV-SNP support in generic code is not appropriate.

So what we will need is:
- a generic change that implements a static inline wrapper around
IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK), and wires it up
in  drivers/firmware/dmi_scan.c;
- a x86 specific change that overrides this DMI helper in terms of
cc_platform_has(CC_ATTR_GUEST_SEV_SNP);
- x86 specific changes that deal with the other scanning

Note that this means that Oak based platforms will lose DMI reporting
and DMI-based quirks, but I think this is reasonable.

More feedback below.


> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index 5b4a1ce3d368..474c24ba0f6f 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -203,7 +203,6 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
>                                          unsigned long npages);
>  void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
>                                         unsigned long npages);
> -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
>  void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
>  void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
>  void snp_set_wakeup_secondary_cpu(void);
> @@ -227,7 +226,6 @@ static inline void __init
>  early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
>  static inline void __init
>  early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> -static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
>  static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
>  static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
>  static inline void snp_set_wakeup_secondary_cpu(void) { }
> diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
> index b223922248e9..39ea771e2d4c 100644
> --- a/arch/x86/kernel/mpparse.c
> +++ b/arch/x86/kernel/mpparse.c
> @@ -553,6 +553,13 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
>                     base, base + length - 1);
>         BUILD_BUG_ON(sizeof(*mpf) != 16);
>
> +       /*
> +        * Skip scan in SEV-SNP guest if it would touch the legacy ROM region,
> +        * as this memory is not pre-validated and would thus cause a crash.
> +        */
> +       if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && base < 0x100000 && base + length >= 0xC0000)
> +               return 0;
> +

Please don't use magic numbers like this, and use memory_intersects()
[unless there is a reason to avoid it which I missed]

Also, really?!? Does modern x86 still rely on scanning arbitrary
regions of memory for magic numbers? Or is this only for those who
prefer vintage boot protocols?

If so, I suppose we might need a generic helper

static inline bool platform_allows_memory_probing(void)

[modulo bikeshedding over the name] where the generic implementation
returns false, and the x86 implementation could take
cc_platform_has(CC_ATTR_GUEST_SEV_SNP) into account, and return true
otherwise.

(On ARM based systems, memory probing is never ok, because the memory
map is not architected, and so probing random addresses might bring
down the machine)

>         while (length > 0) {
>                 bp = early_memremap(base, length);
>                 mpf = (struct mpf_intel *)bp;
> diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
> index 319fef37d9dc..84ff4b052fc1 100644
> --- a/arch/x86/kernel/probe_roms.c
> +++ b/arch/x86/kernel/probe_roms.c
> @@ -204,14 +204,11 @@ void __init probe_roms(void)
>         int i;
>
>         /*
> -        * The ROM memory range is not part of the e820 table and is therefore not
> -        * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
> -        * memory, and SNP requires encrypted memory to be validated before access.
> -        * Do that here.
> +        * These probes are skipped in SEV-SNP guests because the ROM range
> +        * is not pre-validated, meaning access would cause a crash.
>          */
> -       snp_prep_memory(video_rom_resource.start,
> -                       ((system_rom_resource.end + 1) - video_rom_resource.start),
> -                       SNP_PAGE_STATE_PRIVATE);
> +       if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> +               return;
>
>         /* video rom */
>         upper = adapter_rom_resources[0].start;
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index c67285824e82..d2362631da91 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -774,21 +774,6 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
>         early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED);
>  }
>
> -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
> -{
> -       unsigned long vaddr, npages;
> -
> -       vaddr = (unsigned long)__va(paddr);
> -       npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
> -
> -       if (op == SNP_PAGE_STATE_PRIVATE)
> -               early_snp_set_memory_private(vaddr, paddr, npages);
> -       else if (op == SNP_PAGE_STATE_SHARED)
> -               early_snp_set_memory_shared(vaddr, paddr, npages);
> -       else
> -               WARN(1, "invalid memory op %d\n", op);
> -}
> -
>  static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
>                                        unsigned long vaddr_end, int op)
>  {
> diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
> index 015c95a825d3..22e27087eb5b 100644
> --- a/drivers/firmware/dmi_scan.c
> +++ b/drivers/firmware/dmi_scan.c
> @@ -703,7 +703,12 @@ static void __init dmi_scan_machine(void)
>                         dmi_available = 1;
>                         return;
>                 }
> -       } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK)) {
> +       } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK) &&
> +               !cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
> +               /*
> +                * This scan is skipped in SEV-SNP guests because the ROM range
> +                * is not pre-validated, meaning access would cause a crash.
> +                */
>                 p = dmi_early_remap(SMBIOS_ENTRY_POINT_SCAN_START, 0x10000);
>                 if (p == NULL)
>                         goto error;
> --
> 2.44.0.rc0.258.g7320e95886-goog
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-03-08 10:30           ` Ard Biesheuvel
@ 2024-03-08 11:00             ` Borislav Petkov
  2024-03-08 11:44               ` Ard Biesheuvel
  2024-03-08 20:52               ` [PATCH v2] x86/kernel: skip " Kevin Loughlin
  2024-03-08 20:50             ` Kevin Loughlin
  1 sibling, 2 replies; 24+ messages in thread
From: Borislav Petkov @ 2024-03-08 11:00 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Kevin Loughlin, acdunlap, alexander.shishkin, andrisaar, bhe,
	brijesh.singh, dave.hansen, dionnaglaze, grobler, hpa, jacobhxu,
	jpoimboe, kai.huang, linux-kernel, michael.roth, mingo, peterz,
	pgonda, ross.lagerwall, sidtelang, tglx, thomas.lendacky, x86,
	ytcoode

On Fri, Mar 08, 2024 at 11:30:50AM +0100, Ard Biesheuvel wrote:
> Agree with the analysis and the conclusion. However, this will need to
> be split into generic and x86 specific changes, given that the DMI
> code is shared between all architectures, and explicitly checking for
> SEV-SNP support in generic code is not appropriate.
> 
> So what we will need is:

I was actually thinking of:

	x86_init.resources.probe_roms = snp_probe_roms;

and snp_probe_roms() is an empty stub.

Problem solved without ugly sprinkling of checks everywhere.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-03-08 11:00             ` Borislav Petkov
@ 2024-03-08 11:44               ` Ard Biesheuvel
  2024-03-10 17:12                 ` Kevin Loughlin
  2024-03-08 20:52               ` [PATCH v2] x86/kernel: skip " Kevin Loughlin
  1 sibling, 1 reply; 24+ messages in thread
From: Ard Biesheuvel @ 2024-03-08 11:44 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kevin Loughlin, acdunlap, alexander.shishkin, andrisaar, bhe,
	brijesh.singh, dave.hansen, dionnaglaze, grobler, hpa, jacobhxu,
	jpoimboe, kai.huang, linux-kernel, michael.roth, mingo, peterz,
	pgonda, ross.lagerwall, sidtelang, tglx, thomas.lendacky, x86,
	ytcoode

On Fri, 8 Mar 2024 at 12:01, Borislav Petkov <bp@alien8.de> wrote:
>
> On Fri, Mar 08, 2024 at 11:30:50AM +0100, Ard Biesheuvel wrote:
> > Agree with the analysis and the conclusion. However, this will need to
> > be split into generic and x86 specific changes, given that the DMI
> > code is shared between all architectures, and explicitly checking for
> > SEV-SNP support in generic code is not appropriate.
> >
> > So what we will need is:
>
> I was actually thinking of:
>
>         x86_init.resources.probe_roms = snp_probe_roms;
>
> and snp_probe_roms() is an empty stub.
>
> Problem solved without ugly sprinkling of checks everywhere.
>

Indeed. Setting the override could be done in
init_hypervisor_platform(), which is called right before from
setup_arch().

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-02-26 19:15           ` Mike Stunes
@ 2024-03-08 16:10             ` Kevin Loughlin
  2024-03-08 20:41               ` Michael Roth
  0 siblings, 1 reply; 24+ messages in thread
From: Kevin Loughlin @ 2024-03-08 16:10 UTC (permalink / raw)
  To: Mike Stunes
  Cc: acdunlap, alexander.shishkin, andrisaar, ardb, bhe,
	Borislav Petkov, Brijesh Singh, dave.hansen, Dionna Amalie Glaze,
	grobler, hpa, jacobhxu, jpoimboe, Huang, Kai, linux-kernel,
	Michael Roth, mingo, peterz, Peter Gonda, ross.lagerwall,
	sidtelang, tglx, Thomas Lendacky, x86, ytcoode

On Mon, Feb 26, 2024 at 2:16 PM Mike Stunes <mike.stunes@broadcom.com> wrote:
>
> Hi,
>
> > On Feb 22, 2024, at 12:24 PM, Kevin Loughlin <kevinloughlin@google.com> wrote:
> >
> > SEV-SNP requires encrypted memory to be validated before access.
> > Because the ROM memory range is not part of the e820 table, it is not
> > pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
> > to access this range, the guest must first validate the range.
> >
> > The current SEV-SNP code does indeed scan the ROM range during early
> > boot and thus attempts to validate the ROM range in probe_roms().
> > However, this behavior is neither necessary nor sufficient.
> >
> > With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
> > CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> > attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> > falls in the ROM range) prior to validation. The specific problematic
> > call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> > in a crash during boot if SEV-SNP is enabled under these conditions.
> >
> > With regards to necessity, SEV-SNP guests currently read garbage (which
> > changes across boots) from the ROM range, meaning these scans are
> > unnecessary. The guest reads garbage because the legacy ROM range
> > is unencrypted data but is accessed via an encrypted PMD during early
> > boot (where the PMD is marked as encrypted due to potentially mapping
> > actually-encrypted data in other PMD-contained ranges).
> >
> > While one solution would be to overhaul the early PMD mapping to treat
> > the ROM region of the PMD as unencrypted, SEV-SNP guests do not rely on
> > data from the legacy ROM region during early boot (nor can they
> > currently, since the data would be garbage that changes across boots).
> > As such, this patch opts for the simpler approach of skipping the ROM
> > range scans (and the otherwise-necessary range validation) during
> > SEV-SNP guest early boot.
> >
> > Ultimatly, the potential SEV-SNP guest crash due to lack of ROM range
> > validation is avoided by simply not accessing the ROM range.
> >
> > Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
> > Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
> > ---
> > arch/x86/include/asm/sev.h   |  2 --
> > arch/x86/kernel/mpparse.c    |  7 +++++++
> > arch/x86/kernel/probe_roms.c | 11 ++++-------
> > arch/x86/kernel/sev.c        | 15 ---------------
> > drivers/firmware/dmi_scan.c  |  7 ++++++-
> > 5 files changed, 17 insertions(+), 25 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> > index 5b4a1ce3d368..474c24ba0f6f 100644
> > --- a/arch/x86/include/asm/sev.h
> > +++ b/arch/x86/include/asm/sev.h
> > @@ -203,7 +203,6 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
> > unsigned long npages);
> > void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
> > unsigned long npages);
> > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
> > void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
> > void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
> > void snp_set_wakeup_secondary_cpu(void);
> > @@ -227,7 +226,6 @@ static inline void __init
> > early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > static inline void __init
> > early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > -static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
> > static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
> > static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
> > static inline void snp_set_wakeup_secondary_cpu(void) { }
> > diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
> > index b223922248e9..39ea771e2d4c 100644
> > --- a/arch/x86/kernel/mpparse.c
> > +++ b/arch/x86/kernel/mpparse.c
> > @@ -553,6 +553,13 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
> >    base, base + length - 1);
> > BUILD_BUG_ON(sizeof(*mpf) != 16);
> >
> > + /*
> > + * Skip scan in SEV-SNP guest if it would touch the legacy ROM region,
> > + * as this memory is not pre-validated and would thus cause a crash.
> > + */
> > + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && base < 0x100000 && base + length >= 0xC0000)
> > + return 0;
> > +
> > while (length > 0) {
> > bp = early_memremap(base, length);
> > mpf = (struct mpf_intel *)bp;
> > diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
> > index 319fef37d9dc..84ff4b052fc1 100644
> > --- a/arch/x86/kernel/probe_roms.c
> > +++ b/arch/x86/kernel/probe_roms.c
> > @@ -204,14 +204,11 @@ void __init probe_roms(void)
> > int i;
> >
> > /*
> > - * The ROM memory range is not part of the e820 table and is therefore not
> > - * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
> > - * memory, and SNP requires encrypted memory to be validated before access.
> > - * Do that here.
> > + * These probes are skipped in SEV-SNP guests because the ROM range
> > + * is not pre-validated, meaning access would cause a crash.
> > */
> > - snp_prep_memory(video_rom_resource.start,
> > - ((system_rom_resource.end + 1) - video_rom_resource.start),
> > - SNP_PAGE_STATE_PRIVATE);
> > + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> > + return;
> >
> > /* video rom */
> > upper = adapter_rom_resources[0].start;
> > diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> > index c67285824e82..d2362631da91 100644
> > --- a/arch/x86/kernel/sev.c
> > +++ b/arch/x86/kernel/sev.c
> > @@ -774,21 +774,6 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
> > early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED);
> > }
> >
> > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
> > -{
> > - unsigned long vaddr, npages;
> > -
> > - vaddr = (unsigned long)__va(paddr);
> > - npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
> > -
> > - if (op == SNP_PAGE_STATE_PRIVATE)
> > - early_snp_set_memory_private(vaddr, paddr, npages);
> > - else if (op == SNP_PAGE_STATE_SHARED)
> > - early_snp_set_memory_shared(vaddr, paddr, npages);
> > - else
> > - WARN(1, "invalid memory op %d\n", op);
> > -}
> > -
> > static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
> >       unsigned long vaddr_end, int op)
> > {
> > diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
> > index 015c95a825d3..22e27087eb5b 100644
> > --- a/drivers/firmware/dmi_scan.c
> > +++ b/drivers/firmware/dmi_scan.c
> > @@ -703,7 +703,12 @@ static void __init dmi_scan_machine(void)
> > dmi_available = 1;
> > return;
> > }
> > - } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK)) {
> > + } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK) &&
> > + !cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
> > + /*
> > + * This scan is skipped in SEV-SNP guests because the ROM range
> > + * is not pre-validated, meaning access would cause a crash.
> > + */
> > p = dmi_early_remap(SMBIOS_ENTRY_POINT_SCAN_START, 0x10000);
> > if (p == NULL)
> > goto error;
> > --
> > 2.44.0.rc0.258.g7320e95886-goog
> >
> >
>
> In addition to these changes, I also had to skip pirq_find_routing_table if SEV-SNP is active.

Thanks. I will update this in v3.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-02-29 16:54           ` Borislav Petkov
@ 2024-03-08 16:14             ` Kevin Loughlin
  0 siblings, 0 replies; 24+ messages in thread
From: Kevin Loughlin @ 2024-03-08 16:14 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: acdunlap, alexander.shishkin, andrisaar, ardb, bhe,
	brijesh.singh, dave.hansen, dionnaglaze, grobler, hpa, jacobhxu,
	jpoimboe, kai.huang, linux-kernel, michael.roth, mingo, peterz,
	pgonda, ross.lagerwall, sidtelang, tglx, thomas.lendacky, x86,
	ytcoode

On Thu, Feb 29, 2024 at 11:55 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Thu, Feb 22, 2024 at 08:24:04PM +0000, Kevin Loughlin wrote:
> > SEV-SNP requires encrypted memory to be validated before access.
> > Because the ROM memory range is not part of the e820 table, it is not
> > pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
> > to access this range, the guest must first validate the range.
> >
> > The current SEV-SNP code does indeed scan the ROM range during early
> > boot and thus attempts to validate the ROM range in probe_roms().
> > However, this behavior is neither necessary nor sufficient.
>
> Why is this not necessary, all of a sudden?
>
> > With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
> > CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
>
> What is that use case exactly?
>
> CONFIG_DMI_... is usually enabled but the absence of EFI_CONFIG_TABLES
> tells me that you're booting some guest with some special OVMF which
> doesn't sport such tables.
>
> Why?
>
> /me scrolls upthread
>
> Aha, some project oak thing doing a minimal fw. I can see why but this
> should be explained here as to why is this a relevant use case and what
> it is using and so on so that future readers can piece it all together.

Will do in v3 commit message, thanks.

>
> > attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> > falls in the ROM range) prior to validation. The specific problematic
> > call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> > in a crash during boot if SEV-SNP is enabled under these conditions.
> >
> > With regards to necessity, SEV-SNP guests currently read garbage (which
> > changes across boots) from the ROM range, meaning these scans are
> > unnecessary. The guest reads garbage because the legacy ROM range
> > is unencrypted data but is accessed via an encrypted PMD during early
> > boot (where the PMD is marked as encrypted due to potentially mapping
> > actually-encrypted data in other PMD-contained ranges).
>
> I don't mind ripping that ROM probing thing but that thread we're on
> here talks more about why it could be problematic to keep doing so so
> pls summarize that here.
>
> A commit should contain all arguments for why it has been arrived at
> the decision to do it this way.

Ditto.

>
> > While one solution would be to overhaul the early PMD mapping to treat
> > the ROM region of the PMD as unencrypted, SEV-SNP guests do not rely on
> > data from the legacy ROM region during early boot (nor can they
> > currently, since the data would be garbage that changes across boots).
>
> That's better.
>
> > As such, this patch opts for the simpler approach of skipping the ROM
>
> Avoid having "This patch" or "This commit" in the commit message. It is
> tautologically useless.
>
> Also, do
>
> $ git grep 'This patch' Documentation/process
>
> for more details.

Ack, will fix.

>
> > range scans (and the otherwise-necessary range validation) during
> > SEV-SNP guest early boot.
> >
> > Ultimatly, the potential SEV-SNP guest crash due to lack of ROM range
>   ^^^^^^^^^^
>
> Please introduce a spellchecker into your patch creation workflow.

Woops, thanks. I'll fix that.

>
> > validation is avoided by simply not accessing the ROM range.
> >
> > Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
> > Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
> > ---
> >  arch/x86/include/asm/sev.h   |  2 --
> >  arch/x86/kernel/mpparse.c    |  7 +++++++
> >  arch/x86/kernel/probe_roms.c | 11 ++++-------
> >  arch/x86/kernel/sev.c        | 15 ---------------
> >  drivers/firmware/dmi_scan.c  |  7 ++++++-
> >  5 files changed, 17 insertions(+), 25 deletions(-)
>
> ...
>
> > diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
> > index b223922248e9..39ea771e2d4c 100644
> > --- a/arch/x86/kernel/mpparse.c
> > +++ b/arch/x86/kernel/mpparse.c
> > @@ -553,6 +553,13 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
> >                   base, base + length - 1);
> >       BUILD_BUG_ON(sizeof(*mpf) != 16);
> >
> > +     /*
> > +      * Skip scan in SEV-SNP guest if it would touch the legacy ROM region,
> > +      * as this memory is not pre-validated and would thus cause a crash.
> > +      */
> > +     if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && base < 0x100000 && base + length >= 0xC0000)
> > +             return 0;
>
> I don't like spreading around CoCo checks everywhere around the tree.
>
> Think of a better way pls.

Will do. I'll follow up on this in a message in reply to the
subsequent discussion involving Ard.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-03-08 16:10             ` Kevin Loughlin
@ 2024-03-08 20:41               ` Michael Roth
  2024-03-08 21:30                 ` Kevin Loughlin
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Roth @ 2024-03-08 20:41 UTC (permalink / raw)
  To: Kevin Loughlin
  Cc: Mike Stunes, acdunlap, alexander.shishkin, andrisaar, ardb, bhe,
	Borislav Petkov, Brijesh Singh, dave.hansen, Dionna Amalie Glaze,
	grobler, hpa, jacobhxu, jpoimboe, Huang, Kai, linux-kernel,
	mingo, peterz, Peter Gonda, ross.lagerwall, sidtelang, tglx,
	Thomas Lendacky, x86, ytcoode

On Fri, Mar 08, 2024 at 11:10:43AM -0500, Kevin Loughlin wrote:
> On Mon, Feb 26, 2024 at 2:16 PM Mike Stunes <mike.stunes@broadcom.com> wrote:
> >
> > Hi,
> >
> > > On Feb 22, 2024, at 12:24 PM, Kevin Loughlin <kevinloughlin@google.com> wrote:
> > >
> > > SEV-SNP requires encrypted memory to be validated before access.
> > > Because the ROM memory range is not part of the e820 table, it is not
> > > pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
> > > to access this range, the guest must first validate the range.
> > >
> > > The current SEV-SNP code does indeed scan the ROM range during early
> > > boot and thus attempts to validate the ROM range in probe_roms().
> > > However, this behavior is neither necessary nor sufficient.
> > >
> > > With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
> > > CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> > > attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> > > falls in the ROM range) prior to validation. The specific problematic
> > > call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> > > in a crash during boot if SEV-SNP is enabled under these conditions.
> > >
> > > With regards to necessity, SEV-SNP guests currently read garbage (which
> > > changes across boots) from the ROM range, meaning these scans are
> > > unnecessary. The guest reads garbage because the legacy ROM range
> > > is unencrypted data but is accessed via an encrypted PMD during early
> > > boot (where the PMD is marked as encrypted due to potentially mapping
> > > actually-encrypted data in other PMD-contained ranges).
> > >
> > > While one solution would be to overhaul the early PMD mapping to treat
> > > the ROM region of the PMD as unencrypted, SEV-SNP guests do not rely on
> > > data from the legacy ROM region during early boot (nor can they
> > > currently, since the data would be garbage that changes across boots).
> > > As such, this patch opts for the simpler approach of skipping the ROM
> > > range scans (and the otherwise-necessary range validation) during
> > > SEV-SNP guest early boot.
> > >
> > > Ultimatly, the potential SEV-SNP guest crash due to lack of ROM range
> > > validation is avoided by simply not accessing the ROM range.
> > >
> > > Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
> > > Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
> > > ---
> > > arch/x86/include/asm/sev.h   |  2 --
> > > arch/x86/kernel/mpparse.c    |  7 +++++++
> > > arch/x86/kernel/probe_roms.c | 11 ++++-------
> > > arch/x86/kernel/sev.c        | 15 ---------------
> > > drivers/firmware/dmi_scan.c  |  7 ++++++-
> > > 5 files changed, 17 insertions(+), 25 deletions(-)
> > >
> > > diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> > > index 5b4a1ce3d368..474c24ba0f6f 100644
> > > --- a/arch/x86/include/asm/sev.h
> > > +++ b/arch/x86/include/asm/sev.h
> > > @@ -203,7 +203,6 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
> > > unsigned long npages);
> > > void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
> > > unsigned long npages);
> > > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
> > > void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
> > > void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
> > > void snp_set_wakeup_secondary_cpu(void);
> > > @@ -227,7 +226,6 @@ static inline void __init
> > > early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > > static inline void __init
> > > early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > > -static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
> > > static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
> > > static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
> > > static inline void snp_set_wakeup_secondary_cpu(void) { }
> > > diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
> > > index b223922248e9..39ea771e2d4c 100644
> > > --- a/arch/x86/kernel/mpparse.c
> > > +++ b/arch/x86/kernel/mpparse.c
> > > @@ -553,6 +553,13 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
> > >    base, base + length - 1);
> > > BUILD_BUG_ON(sizeof(*mpf) != 16);
> > >
> > > + /*
> > > + * Skip scan in SEV-SNP guest if it would touch the legacy ROM region,
> > > + * as this memory is not pre-validated and would thus cause a crash.
> > > + */
> > > + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && base < 0x100000 && base + length >= 0xC0000)
> > > + return 0;
> > > +
> > > while (length > 0) {
> > > bp = early_memremap(base, length);
> > > mpf = (struct mpf_intel *)bp;
> > > diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
> > > index 319fef37d9dc..84ff4b052fc1 100644
> > > --- a/arch/x86/kernel/probe_roms.c
> > > +++ b/arch/x86/kernel/probe_roms.c
> > > @@ -204,14 +204,11 @@ void __init probe_roms(void)
> > > int i;
> > >
> > > /*
> > > - * The ROM memory range is not part of the e820 table and is therefore not
> > > - * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
> > > - * memory, and SNP requires encrypted memory to be validated before access.
> > > - * Do that here.
> > > + * These probes are skipped in SEV-SNP guests because the ROM range
> > > + * is not pre-validated, meaning access would cause a crash.
> > > */
> > > - snp_prep_memory(video_rom_resource.start,
> > > - ((system_rom_resource.end + 1) - video_rom_resource.start),
> > > - SNP_PAGE_STATE_PRIVATE);
> > > + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> > > + return;
> > >
> > > /* video rom */
> > > upper = adapter_rom_resources[0].start;
> > > diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> > > index c67285824e82..d2362631da91 100644
> > > --- a/arch/x86/kernel/sev.c
> > > +++ b/arch/x86/kernel/sev.c
> > > @@ -774,21 +774,6 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
> > > early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED);
> > > }
> > >
> > > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
> > > -{
> > > - unsigned long vaddr, npages;
> > > -
> > > - vaddr = (unsigned long)__va(paddr);
> > > - npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
> > > -
> > > - if (op == SNP_PAGE_STATE_PRIVATE)
> > > - early_snp_set_memory_private(vaddr, paddr, npages);
> > > - else if (op == SNP_PAGE_STATE_SHARED)
> > > - early_snp_set_memory_shared(vaddr, paddr, npages);
> > > - else
> > > - WARN(1, "invalid memory op %d\n", op);
> > > -}
> > > -
> > > static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
> > >       unsigned long vaddr_end, int op)
> > > {
> > > diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
> > > index 015c95a825d3..22e27087eb5b 100644
> > > --- a/drivers/firmware/dmi_scan.c
> > > +++ b/drivers/firmware/dmi_scan.c
> > > @@ -703,7 +703,12 @@ static void __init dmi_scan_machine(void)
> > > dmi_available = 1;
> > > return;
> > > }
> > > - } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK)) {
> > > + } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK) &&
> > > + !cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
> > > + /*
> > > + * This scan is skipped in SEV-SNP guests because the ROM range
> > > + * is not pre-validated, meaning access would cause a crash.
> > > + */
> > > p = dmi_early_remap(SMBIOS_ENTRY_POINT_SCAN_START, 0x10000);
> > > if (p == NULL)
> > > goto error;
> > > --
> > > 2.44.0.rc0.258.g7320e95886-goog
> > >
> > >
> >
> > In addition to these changes, I also had to skip pirq_find_routing_table if SEV-SNP is active.
> 
> Thanks. I will update this in v3.

There's also another access a bit later in boot:

  static __init int eisa_bus_probe(void)
  {
    ...
    ioremap(0x0FFFD9, 4);
  }

This time it's via ioremap() with the encryption bit *unset*, so it
won't necessarily cause a crash but it's inconsistent with the early
page table having that region set as encrypted.

We discussed unsetting the encryption bit in early page table with
security folks and the general consensus was that *if* any VMM/firmware
ever came along that does want to make use of legacy region for any reason
(such as providing DMI/SMBIOS info) it would be safest to require that they
encrypt the data in the region before handing off to guest kernel, so it
makes sense to patch away unecrypted accesses to the legacy region so the
don't cause problems down the road (like causing implicit page state
change from private->shared and throwing away data in the region later
in boot).

-Mike

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-03-08 10:30           ` Ard Biesheuvel
  2024-03-08 11:00             ` Borislav Petkov
@ 2024-03-08 20:50             ` Kevin Loughlin
  1 sibling, 0 replies; 24+ messages in thread
From: Kevin Loughlin @ 2024-03-08 20:50 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: acdunlap, alexander.shishkin, andrisaar, bhe, bp, brijesh.singh,
	dave.hansen, dionnaglaze, grobler, hpa, jacobhxu, jpoimboe,
	kai.huang, linux-kernel, michael.roth, mingo, peterz, pgonda,
	ross.lagerwall, sidtelang, tglx, thomas.lendacky, x86, ytcoode

On Fri, Mar 8, 2024 at 5:31 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Thu, 22 Feb 2024 at 21:25, Kevin Loughlin <kevinloughlin@google.com> wrote:
> >
> > SEV-SNP requires encrypted memory to be validated before access.
> > Because the ROM memory range is not part of the e820 table, it is not
> > pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
> > to access this range, the guest must first validate the range.
> >
> > The current SEV-SNP code does indeed scan the ROM range during early
> > boot and thus attempts to validate the ROM range in probe_roms().
> > However, this behavior is neither necessary nor sufficient.
> >
> > With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
> > CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> > attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> > falls in the ROM range) prior to validation. The specific problematic
> > call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> > in a crash during boot if SEV-SNP is enabled under these conditions.
> >
> > With regards to necessity, SEV-SNP guests currently read garbage (which
> > changes across boots) from the ROM range, meaning these scans are
> > unnecessary. The guest reads garbage because the legacy ROM range
> > is unencrypted data but is accessed via an encrypted PMD during early
> > boot (where the PMD is marked as encrypted due to potentially mapping
> > actually-encrypted data in other PMD-contained ranges).
> >
> > While one solution would be to overhaul the early PMD mapping to treat
> > the ROM region of the PMD as unencrypted, SEV-SNP guests do not rely on
> > data from the legacy ROM region during early boot (nor can they
> > currently, since the data would be garbage that changes across boots).
> > As such, this patch opts for the simpler approach of skipping the ROM
> > range scans (and the otherwise-necessary range validation) during
> > SEV-SNP guest early boot.
> >
> > Ultimatly, the potential SEV-SNP guest crash due to lack of ROM range
> > validation is avoided by simply not accessing the ROM range.
> >
> > Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
> > Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
> > ---
> >  arch/x86/include/asm/sev.h   |  2 --
> >  arch/x86/kernel/mpparse.c    |  7 +++++++
> >  arch/x86/kernel/probe_roms.c | 11 ++++-------
> >  arch/x86/kernel/sev.c        | 15 ---------------
> >  drivers/firmware/dmi_scan.c  |  7 ++++++-
> >  5 files changed, 17 insertions(+), 25 deletions(-)
> >
>
> Agree with the analysis and the conclusion. However, this will need to
> be split into generic and x86 specific changes, given that the DMI
> code is shared between all architectures, and explicitly checking for
> SEV-SNP support in generic code is not appropriate.
>
> So what we will need is:
> - a generic change that implements a static inline wrapper around
> IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK), and wires it up
> in  drivers/firmware/dmi_scan.c;
> - a x86 specific change that overrides this DMI helper in terms of
> cc_platform_has(CC_ATTR_GUEST_SEV_SNP);
> - x86 specific changes that deal with the other scanning
>
> Note that this means that Oak based platforms will lose DMI reporting
> and DMI-based quirks, but I think this is reasonable.

Agreed. However, upon further review, I think we can get away with
only modifying arch/x86/ code.

Besides the DMI case, all other needed changes are already contained
in arch/x86/, and we can replace the relevant init functions for
SEV-SNP guests with empty stubs as Boris and you mention in our
discussion.

For the DMI case, we can add an x86-init function pointer to
dmi_setup() that defaults to the generic dmi_setup function(), which
would be modified to point to snp_dmi_setup() on SNP-enabled guests
during initialization (where the fallback scan would be skipped for
SNP guests). This way, we would both leave multi-arch code alone and
avoid spreading cc_platform_has() scans around as Boris mentioned.

I plan to implement this behavior in v3 unless you have a preference
for something different.

>
> More feedback below.
>
>
> > diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> > index 5b4a1ce3d368..474c24ba0f6f 100644
> > --- a/arch/x86/include/asm/sev.h
> > +++ b/arch/x86/include/asm/sev.h
> > @@ -203,7 +203,6 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
> >                                          unsigned long npages);
> >  void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
> >                                         unsigned long npages);
> > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
> >  void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
> >  void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
> >  void snp_set_wakeup_secondary_cpu(void);
> > @@ -227,7 +226,6 @@ static inline void __init
> >  early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> >  static inline void __init
> >  early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > -static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
> >  static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
> >  static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
> >  static inline void snp_set_wakeup_secondary_cpu(void) { }
> > diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
> > index b223922248e9..39ea771e2d4c 100644
> > --- a/arch/x86/kernel/mpparse.c
> > +++ b/arch/x86/kernel/mpparse.c
> > @@ -553,6 +553,13 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
> >                     base, base + length - 1);
> >         BUILD_BUG_ON(sizeof(*mpf) != 16);
> >
> > +       /*
> > +        * Skip scan in SEV-SNP guest if it would touch the legacy ROM region,
> > +        * as this memory is not pre-validated and would thus cause a crash.
> > +        */
> > +       if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && base < 0x100000 && base + length >= 0xC0000)
> > +               return 0;
> > +
>
> Please don't use magic numbers like this, and use memory_intersects()
> [unless there is a reason to avoid it which I missed]

Yes, memory_intersects() is better, as are macros here. Thanks.

>
> Also, really?!? Does modern x86 still rely on scanning arbitrary
> regions of memory for magic numbers? Or is this only for those who
> prefer vintage boot protocols?
>
> If so, I suppose we might need a generic helper
>
> static inline bool platform_allows_memory_probing(void)
>
> [modulo bikeshedding over the name] where the generic implementation
> returns false, and the x86 implementation could take
> cc_platform_has(CC_ATTR_GUEST_SEV_SNP) into account, and return true
> otherwise.
>
> (On ARM based systems, memory probing is never ok, because the memory
> map is not architected, and so probing random addresses might bring
> down the machine)

Roughly-speaking, the x86 memory probes are generally performed to
support legacy devices/reserved regions/boot sequences that assume
these hardcoded addresses. Given the ability to point probe_roms() and
similar x86_init functions to empty stubs (and the fact that x86_init
functions are, by definition, x86-specific), we should be able to
avoid needing a "platform_allows_memory_probing()" function in these
cases.

As for the DMI probing behavior in dmi_scan_machine(), the probing
only currently occurs if both (a) the config tables are not provided
by EFI [i.e., `efi_enabled(EFI_CONFIG_TABLES)` is false] and (b)
DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set [which is not selected on
ARM, consistent with memory probing on ARM being disallowed]. As such,
DMI_SCAN_MACHINE_NON_EFI_FALLBACK effectively provides the
"platform_allows_memory_probing" functionality for this singular use
case.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-03-08 11:00             ` Borislav Petkov
  2024-03-08 11:44               ` Ard Biesheuvel
@ 2024-03-08 20:52               ` Kevin Loughlin
  1 sibling, 0 replies; 24+ messages in thread
From: Kevin Loughlin @ 2024-03-08 20:52 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ard Biesheuvel, acdunlap, alexander.shishkin, andrisaar, bhe,
	brijesh.singh, dave.hansen, dionnaglaze, grobler, hpa, jacobhxu,
	jpoimboe, kai.huang, linux-kernel, michael.roth, mingo, peterz,
	pgonda, ross.lagerwall, sidtelang, tglx, thomas.lendacky, x86,
	ytcoode

On Fri, Mar 8, 2024 at 6:01 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Fri, Mar 08, 2024 at 11:30:50AM +0100, Ard Biesheuvel wrote:
> > Agree with the analysis and the conclusion. However, this will need to
> > be split into generic and x86 specific changes, given that the DMI
> > code is shared between all architectures, and explicitly checking for
> > SEV-SNP support in generic code is not appropriate.
> >
> > So what we will need is:
>
> I was actually thinking of:
>
>         x86_init.resources.probe_roms = snp_probe_roms;
>
> and snp_probe_roms() is an empty stub.
>
> Problem solved without ugly sprinkling of checks everywhere.

Agreed, this is nicer; we can just add dmi_setup to x86_init as well
to implement the same concept for all checks we need.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-03-08 20:41               ` Michael Roth
@ 2024-03-08 21:30                 ` Kevin Loughlin
  2024-03-08 23:01                   ` Michael Roth
  0 siblings, 1 reply; 24+ messages in thread
From: Kevin Loughlin @ 2024-03-08 21:30 UTC (permalink / raw)
  To: Michael Roth
  Cc: Mike Stunes, acdunlap, alexander.shishkin, andrisaar, ardb, bhe,
	Borislav Petkov, Brijesh Singh, dave.hansen, Dionna Amalie Glaze,
	grobler, hpa, jacobhxu, jpoimboe, Huang, Kai, linux-kernel,
	mingo, peterz, Peter Gonda, ross.lagerwall, sidtelang, tglx,
	Thomas Lendacky, x86, ytcoode

On Fri, Mar 8, 2024 at 3:44 PM Michael Roth <michael.roth@amd.com> wrote:
>
> On Fri, Mar 08, 2024 at 11:10:43AM -0500, Kevin Loughlin wrote:
> > On Mon, Feb 26, 2024 at 2:16 PM Mike Stunes <mike.stunes@broadcom.com> wrote:
> > >
> > > Hi,
> > >
> > > > On Feb 22, 2024, at 12:24 PM, Kevin Loughlin <kevinloughlin@google.com> wrote:
> > > >
> > > > SEV-SNP requires encrypted memory to be validated before access.
> > > > Because the ROM memory range is not part of the e820 table, it is not
> > > > pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
> > > > to access this range, the guest must first validate the range.
> > > >
> > > > The current SEV-SNP code does indeed scan the ROM range during early
> > > > boot and thus attempts to validate the ROM range in probe_roms().
> > > > However, this behavior is neither necessary nor sufficient.
> > > >
> > > > With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
> > > > CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> > > > attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> > > > falls in the ROM range) prior to validation. The specific problematic
> > > > call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> > > > in a crash during boot if SEV-SNP is enabled under these conditions.
> > > >
> > > > With regards to necessity, SEV-SNP guests currently read garbage (which
> > > > changes across boots) from the ROM range, meaning these scans are
> > > > unnecessary. The guest reads garbage because the legacy ROM range
> > > > is unencrypted data but is accessed via an encrypted PMD during early
> > > > boot (where the PMD is marked as encrypted due to potentially mapping
> > > > actually-encrypted data in other PMD-contained ranges).
> > > >
> > > > While one solution would be to overhaul the early PMD mapping to treat
> > > > the ROM region of the PMD as unencrypted, SEV-SNP guests do not rely on
> > > > data from the legacy ROM region during early boot (nor can they
> > > > currently, since the data would be garbage that changes across boots).
> > > > As such, this patch opts for the simpler approach of skipping the ROM
> > > > range scans (and the otherwise-necessary range validation) during
> > > > SEV-SNP guest early boot.
> > > >
> > > > Ultimatly, the potential SEV-SNP guest crash due to lack of ROM range
> > > > validation is avoided by simply not accessing the ROM range.
> > > >
> > > > Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
> > > > Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
> > > > ---
> > > > arch/x86/include/asm/sev.h   |  2 --
> > > > arch/x86/kernel/mpparse.c    |  7 +++++++
> > > > arch/x86/kernel/probe_roms.c | 11 ++++-------
> > > > arch/x86/kernel/sev.c        | 15 ---------------
> > > > drivers/firmware/dmi_scan.c  |  7 ++++++-
> > > > 5 files changed, 17 insertions(+), 25 deletions(-)
> > > >
> > > > diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> > > > index 5b4a1ce3d368..474c24ba0f6f 100644
> > > > --- a/arch/x86/include/asm/sev.h
> > > > +++ b/arch/x86/include/asm/sev.h
> > > > @@ -203,7 +203,6 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
> > > > unsigned long npages);
> > > > void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
> > > > unsigned long npages);
> > > > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
> > > > void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
> > > > void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
> > > > void snp_set_wakeup_secondary_cpu(void);
> > > > @@ -227,7 +226,6 @@ static inline void __init
> > > > early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > > > static inline void __init
> > > > early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > > > -static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
> > > > static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
> > > > static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
> > > > static inline void snp_set_wakeup_secondary_cpu(void) { }
> > > > diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
> > > > index b223922248e9..39ea771e2d4c 100644
> > > > --- a/arch/x86/kernel/mpparse.c
> > > > +++ b/arch/x86/kernel/mpparse.c
> > > > @@ -553,6 +553,13 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
> > > >    base, base + length - 1);
> > > > BUILD_BUG_ON(sizeof(*mpf) != 16);
> > > >
> > > > + /*
> > > > + * Skip scan in SEV-SNP guest if it would touch the legacy ROM region,
> > > > + * as this memory is not pre-validated and would thus cause a crash.
> > > > + */
> > > > + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && base < 0x100000 && base + length >= 0xC0000)
> > > > + return 0;
> > > > +
> > > > while (length > 0) {
> > > > bp = early_memremap(base, length);
> > > > mpf = (struct mpf_intel *)bp;
> > > > diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
> > > > index 319fef37d9dc..84ff4b052fc1 100644
> > > > --- a/arch/x86/kernel/probe_roms.c
> > > > +++ b/arch/x86/kernel/probe_roms.c
> > > > @@ -204,14 +204,11 @@ void __init probe_roms(void)
> > > > int i;
> > > >
> > > > /*
> > > > - * The ROM memory range is not part of the e820 table and is therefore not
> > > > - * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
> > > > - * memory, and SNP requires encrypted memory to be validated before access.
> > > > - * Do that here.
> > > > + * These probes are skipped in SEV-SNP guests because the ROM range
> > > > + * is not pre-validated, meaning access would cause a crash.
> > > > */
> > > > - snp_prep_memory(video_rom_resource.start,
> > > > - ((system_rom_resource.end + 1) - video_rom_resource.start),
> > > > - SNP_PAGE_STATE_PRIVATE);
> > > > + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> > > > + return;
> > > >
> > > > /* video rom */
> > > > upper = adapter_rom_resources[0].start;
> > > > diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> > > > index c67285824e82..d2362631da91 100644
> > > > --- a/arch/x86/kernel/sev.c
> > > > +++ b/arch/x86/kernel/sev.c
> > > > @@ -774,21 +774,6 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
> > > > early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED);
> > > > }
> > > >
> > > > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
> > > > -{
> > > > - unsigned long vaddr, npages;
> > > > -
> > > > - vaddr = (unsigned long)__va(paddr);
> > > > - npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
> > > > -
> > > > - if (op == SNP_PAGE_STATE_PRIVATE)
> > > > - early_snp_set_memory_private(vaddr, paddr, npages);
> > > > - else if (op == SNP_PAGE_STATE_SHARED)
> > > > - early_snp_set_memory_shared(vaddr, paddr, npages);
> > > > - else
> > > > - WARN(1, "invalid memory op %d\n", op);
> > > > -}
> > > > -
> > > > static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
> > > >       unsigned long vaddr_end, int op)
> > > > {
> > > > diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
> > > > index 015c95a825d3..22e27087eb5b 100644
> > > > --- a/drivers/firmware/dmi_scan.c
> > > > +++ b/drivers/firmware/dmi_scan.c
> > > > @@ -703,7 +703,12 @@ static void __init dmi_scan_machine(void)
> > > > dmi_available = 1;
> > > > return;
> > > > }
> > > > - } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK)) {
> > > > + } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK) &&
> > > > + !cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
> > > > + /*
> > > > + * This scan is skipped in SEV-SNP guests because the ROM range
> > > > + * is not pre-validated, meaning access would cause a crash.
> > > > + */
> > > > p = dmi_early_remap(SMBIOS_ENTRY_POINT_SCAN_START, 0x10000);
> > > > if (p == NULL)
> > > > goto error;
> > > > --
> > > > 2.44.0.rc0.258.g7320e95886-goog
> > > >
> > > >
> > >
> > > In addition to these changes, I also had to skip pirq_find_routing_table if SEV-SNP is active.
> >
> > Thanks. I will update this in v3.
>
> There's also another access a bit later in boot:
>
>   static __init int eisa_bus_probe(void)
>   {
>     ...
>     ioremap(0x0FFFD9, 4);
>   }
>
> This time it's via ioremap() with the encryption bit *unset*, so it
> won't necessarily cause a crash but it's inconsistent with the early
> page table having that region set as encrypted.
>
> We discussed unsetting the encryption bit in early page table with
> security folks and the general consensus was that *if* any VMM/firmware
> ever came along that does want to make use of legacy region for any reason
> (such as providing DMI/SMBIOS info) it would be safest to require that they
> encrypt the data in the region before handing off to guest kernel, so it
> makes sense to patch away unecrypted accesses to the legacy region so the
> don't cause problems down the road (like causing implicit page state
> change from private->shared and throwing away data in the region later
> in boot).

Sounds good, thanks. Since this one won't cause crashes, I will place
it in a separate patch in the series to separate (current) functional
fixes from cleanup, especially since there may be similar legacy
probes to cleanup in various types of guests. Please let me know if
you feel differently or have additional thoughts.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-03-08 21:30                 ` Kevin Loughlin
@ 2024-03-08 23:01                   ` Michael Roth
  2024-03-11 21:05                     ` Kevin Loughlin
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Roth @ 2024-03-08 23:01 UTC (permalink / raw)
  To: Kevin Loughlin
  Cc: Mike Stunes, acdunlap, alexander.shishkin, andrisaar, ardb, bhe,
	Borislav Petkov, Brijesh Singh, dave.hansen, Dionna Amalie Glaze,
	grobler, hpa, jacobhxu, jpoimboe, Huang, Kai, linux-kernel,
	mingo, peterz, Peter Gonda, ross.lagerwall, sidtelang, tglx,
	Thomas Lendacky, x86, ytcoode

On Fri, Mar 08, 2024 at 04:30:55PM -0500, Kevin Loughlin wrote:
> On Fri, Mar 8, 2024 at 3:44 PM Michael Roth <michael.roth@amd.com> wrote:
> >
> > On Fri, Mar 08, 2024 at 11:10:43AM -0500, Kevin Loughlin wrote:
> > > On Mon, Feb 26, 2024 at 2:16 PM Mike Stunes <mike.stunes@broadcom.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > > On Feb 22, 2024, at 12:24 PM, Kevin Loughlin <kevinloughlin@google.com> wrote:
> > > > >
> > > > > SEV-SNP requires encrypted memory to be validated before access.
> > > > > Because the ROM memory range is not part of the e820 table, it is not
> > > > > pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
> > > > > to access this range, the guest must first validate the range.
> > > > >
> > > > > The current SEV-SNP code does indeed scan the ROM range during early
> > > > > boot and thus attempts to validate the ROM range in probe_roms().
> > > > > However, this behavior is neither necessary nor sufficient.
> > > > >
> > > > > With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
> > > > > CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> > > > > attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> > > > > falls in the ROM range) prior to validation. The specific problematic
> > > > > call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> > > > > in a crash during boot if SEV-SNP is enabled under these conditions.
> > > > >
> > > > > With regards to necessity, SEV-SNP guests currently read garbage (which
> > > > > changes across boots) from the ROM range, meaning these scans are
> > > > > unnecessary. The guest reads garbage because the legacy ROM range
> > > > > is unencrypted data but is accessed via an encrypted PMD during early
> > > > > boot (where the PMD is marked as encrypted due to potentially mapping
> > > > > actually-encrypted data in other PMD-contained ranges).
> > > > >
> > > > > While one solution would be to overhaul the early PMD mapping to treat
> > > > > the ROM region of the PMD as unencrypted, SEV-SNP guests do not rely on
> > > > > data from the legacy ROM region during early boot (nor can they
> > > > > currently, since the data would be garbage that changes across boots).
> > > > > As such, this patch opts for the simpler approach of skipping the ROM
> > > > > range scans (and the otherwise-necessary range validation) during
> > > > > SEV-SNP guest early boot.
> > > > >
> > > > > Ultimatly, the potential SEV-SNP guest crash due to lack of ROM range
> > > > > validation is avoided by simply not accessing the ROM range.
> > > > >
> > > > > Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
> > > > > Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
> > > > > ---
> > > > > arch/x86/include/asm/sev.h   |  2 --
> > > > > arch/x86/kernel/mpparse.c    |  7 +++++++
> > > > > arch/x86/kernel/probe_roms.c | 11 ++++-------
> > > > > arch/x86/kernel/sev.c        | 15 ---------------
> > > > > drivers/firmware/dmi_scan.c  |  7 ++++++-
> > > > > 5 files changed, 17 insertions(+), 25 deletions(-)
> > > > >
> > > > > diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> > > > > index 5b4a1ce3d368..474c24ba0f6f 100644
> > > > > --- a/arch/x86/include/asm/sev.h
> > > > > +++ b/arch/x86/include/asm/sev.h
> > > > > @@ -203,7 +203,6 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
> > > > > unsigned long npages);
> > > > > void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
> > > > > unsigned long npages);
> > > > > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
> > > > > void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
> > > > > void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
> > > > > void snp_set_wakeup_secondary_cpu(void);
> > > > > @@ -227,7 +226,6 @@ static inline void __init
> > > > > early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > > > > static inline void __init
> > > > > early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > > > > -static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
> > > > > static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
> > > > > static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
> > > > > static inline void snp_set_wakeup_secondary_cpu(void) { }
> > > > > diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
> > > > > index b223922248e9..39ea771e2d4c 100644
> > > > > --- a/arch/x86/kernel/mpparse.c
> > > > > +++ b/arch/x86/kernel/mpparse.c
> > > > > @@ -553,6 +553,13 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
> > > > >    base, base + length - 1);
> > > > > BUILD_BUG_ON(sizeof(*mpf) != 16);
> > > > >
> > > > > + /*
> > > > > + * Skip scan in SEV-SNP guest if it would touch the legacy ROM region,
> > > > > + * as this memory is not pre-validated and would thus cause a crash.
> > > > > + */
> > > > > + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && base < 0x100000 && base + length >= 0xC0000)
> > > > > + return 0;
> > > > > +
> > > > > while (length > 0) {
> > > > > bp = early_memremap(base, length);
> > > > > mpf = (struct mpf_intel *)bp;
> > > > > diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
> > > > > index 319fef37d9dc..84ff4b052fc1 100644
> > > > > --- a/arch/x86/kernel/probe_roms.c
> > > > > +++ b/arch/x86/kernel/probe_roms.c
> > > > > @@ -204,14 +204,11 @@ void __init probe_roms(void)
> > > > > int i;
> > > > >
> > > > > /*
> > > > > - * The ROM memory range is not part of the e820 table and is therefore not
> > > > > - * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
> > > > > - * memory, and SNP requires encrypted memory to be validated before access.
> > > > > - * Do that here.
> > > > > + * These probes are skipped in SEV-SNP guests because the ROM range
> > > > > + * is not pre-validated, meaning access would cause a crash.
> > > > > */
> > > > > - snp_prep_memory(video_rom_resource.start,
> > > > > - ((system_rom_resource.end + 1) - video_rom_resource.start),
> > > > > - SNP_PAGE_STATE_PRIVATE);
> > > > > + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> > > > > + return;
> > > > >
> > > > > /* video rom */
> > > > > upper = adapter_rom_resources[0].start;
> > > > > diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> > > > > index c67285824e82..d2362631da91 100644
> > > > > --- a/arch/x86/kernel/sev.c
> > > > > +++ b/arch/x86/kernel/sev.c
> > > > > @@ -774,21 +774,6 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
> > > > > early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED);
> > > > > }
> > > > >
> > > > > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
> > > > > -{
> > > > > - unsigned long vaddr, npages;
> > > > > -
> > > > > - vaddr = (unsigned long)__va(paddr);
> > > > > - npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
> > > > > -
> > > > > - if (op == SNP_PAGE_STATE_PRIVATE)
> > > > > - early_snp_set_memory_private(vaddr, paddr, npages);
> > > > > - else if (op == SNP_PAGE_STATE_SHARED)
> > > > > - early_snp_set_memory_shared(vaddr, paddr, npages);
> > > > > - else
> > > > > - WARN(1, "invalid memory op %d\n", op);
> > > > > -}
> > > > > -
> > > > > static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
> > > > >       unsigned long vaddr_end, int op)
> > > > > {
> > > > > diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
> > > > > index 015c95a825d3..22e27087eb5b 100644
> > > > > --- a/drivers/firmware/dmi_scan.c
> > > > > +++ b/drivers/firmware/dmi_scan.c
> > > > > @@ -703,7 +703,12 @@ static void __init dmi_scan_machine(void)
> > > > > dmi_available = 1;
> > > > > return;
> > > > > }
> > > > > - } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK)) {
> > > > > + } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK) &&
> > > > > + !cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
> > > > > + /*
> > > > > + * This scan is skipped in SEV-SNP guests because the ROM range
> > > > > + * is not pre-validated, meaning access would cause a crash.
> > > > > + */
> > > > > p = dmi_early_remap(SMBIOS_ENTRY_POINT_SCAN_START, 0x10000);
> > > > > if (p == NULL)
> > > > > goto error;
> > > > > --
> > > > > 2.44.0.rc0.258.g7320e95886-goog
> > > > >
> > > > >
> > > >
> > > > In addition to these changes, I also had to skip pirq_find_routing_table if SEV-SNP is active.
> > >
> > > Thanks. I will update this in v3.
> >
> > There's also another access a bit later in boot:
> >
> >   static __init int eisa_bus_probe(void)
> >   {
> >     ...
> >     ioremap(0x0FFFD9, 4);
> >   }
> >
> > This time it's via ioremap() with the encryption bit *unset*, so it
> > won't necessarily cause a crash but it's inconsistent with the early
> > page table having that region set as encrypted.
> >
> > We discussed unsetting the encryption bit in early page table with
> > security folks and the general consensus was that *if* any VMM/firmware
> > ever came along that does want to make use of legacy region for any reason
> > (such as providing DMI/SMBIOS info) it would be safest to require that they
> > encrypt the data in the region before handing off to guest kernel, so it
> > makes sense to patch away unecrypted accesses to the legacy region so the
> > don't cause problems down the road (like causing implicit page state
> > change from private->shared and throwing away data in the region later
> > in boot).
> 
> Sounds good, thanks. Since this one won't cause crashes, I will place
> it in a separate patch in the series to separate (current) functional
> fixes from cleanup, especially since there may be similar legacy
> probes to cleanup in various types of guests. Please let me know if
> you feel differently or have additional thoughts.

I think it could still be argued that it's a fix. It's just that the
main set of fixes avoid reading garbage for VMM/firmwares that *don't*
encrypt these regions, whereas this additional fix handles the case for
VMM/firmwares that *do* encrypt these regions. It's possible they exist
in the case of SEV (though I don't know of any). Might still make sense
to distinguish the 2 cases since latter is more theoretical, but both
still address the kernel modifying its behavior based on scanning random
garbage for strings.

-Mike

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-03-08 11:44               ` Ard Biesheuvel
@ 2024-03-10 17:12                 ` Kevin Loughlin
  2024-03-11 10:43                   ` Ard Biesheuvel
  0 siblings, 1 reply; 24+ messages in thread
From: Kevin Loughlin @ 2024-03-10 17:12 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Borislav Petkov, acdunlap, alexander.shishkin, andrisaar, bhe,
	brijesh.singh, dave.hansen, dionnaglaze, grobler, hpa, jacobhxu,
	jpoimboe, kai.huang, linux-kernel, michael.roth, mingo, peterz,
	pgonda, ross.lagerwall, sidtelang, tglx, thomas.lendacky, x86,
	ytcoode

On Fri, Mar 8, 2024 at 3:44 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Fri, 8 Mar 2024 at 12:01, Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Fri, Mar 08, 2024 at 11:30:50AM +0100, Ard Biesheuvel wrote:
> > > Agree with the analysis and the conclusion. However, this will need to
> > > be split into generic and x86 specific changes, given that the DMI
> > > code is shared between all architectures, and explicitly checking for
> > > SEV-SNP support in generic code is not appropriate.
> > >
> > > So what we will need is:
> >
> > I was actually thinking of:
> >
> >         x86_init.resources.probe_roms = snp_probe_roms;
> >
> > and snp_probe_roms() is an empty stub.
> >
> > Problem solved without ugly sprinkling of checks everywhere.
> >
>
> Indeed. Setting the override could be done in
> init_hypervisor_platform(), which is called right before from
> setup_arch().

The call to init_hypervisor_platform() has a comment saying it must
come after dmi_setup() (i.e., init_hypervisor_platform() would *not*
work for doing a dmi_setup() override), so I'm currently planning to
do the overrides at the end of snp_init() in arch/x86/kernel/sev.c
instead (which comes before both). This would be somewhat similar to
how there are early setup functions for specific platforms that
perform init overrides for different reasons (example:
x86_ce4100_early_setup()). Open to other locations of course.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-03-10 17:12                 ` Kevin Loughlin
@ 2024-03-11 10:43                   ` Ard Biesheuvel
  2024-03-13 12:15                     ` [PATCH v3] " Kevin Loughlin
  0 siblings, 1 reply; 24+ messages in thread
From: Ard Biesheuvel @ 2024-03-11 10:43 UTC (permalink / raw)
  To: Kevin Loughlin
  Cc: Borislav Petkov, acdunlap, alexander.shishkin, andrisaar, bhe,
	brijesh.singh, dave.hansen, dionnaglaze, grobler, hpa, jacobhxu,
	jpoimboe, kai.huang, linux-kernel, michael.roth, mingo, peterz,
	pgonda, ross.lagerwall, sidtelang, tglx, thomas.lendacky, x86,
	ytcoode

On Sun, 10 Mar 2024 at 18:12, Kevin Loughlin <kevinloughlin@google.com> wrote:
>
> On Fri, Mar 8, 2024 at 3:44 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Fri, 8 Mar 2024 at 12:01, Borislav Petkov <bp@alien8.de> wrote:
> > >
> > > On Fri, Mar 08, 2024 at 11:30:50AM +0100, Ard Biesheuvel wrote:
> > > > Agree with the analysis and the conclusion. However, this will need to
> > > > be split into generic and x86 specific changes, given that the DMI
> > > > code is shared between all architectures, and explicitly checking for
> > > > SEV-SNP support in generic code is not appropriate.
> > > >
> > > > So what we will need is:
> > >
> > > I was actually thinking of:
> > >
> > >         x86_init.resources.probe_roms = snp_probe_roms;
> > >
> > > and snp_probe_roms() is an empty stub.
> > >
> > > Problem solved without ugly sprinkling of checks everywhere.
> > >
> >
> > Indeed. Setting the override could be done in
> > init_hypervisor_platform(), which is called right before from
> > setup_arch().
>
> The call to init_hypervisor_platform() has a comment saying it must
> come after dmi_setup() (i.e., init_hypervisor_platform() would *not*
> work for doing a dmi_setup() override), so I'm currently planning to
> do the overrides at the end of snp_init() in arch/x86/kernel/sev.c
> instead (which comes before both). This would be somewhat similar to
> how there are early setup functions for specific platforms that
> perform init overrides for different reasons (example:
> x86_ce4100_early_setup()). Open to other locations of course.

snp_init() is one of those routines that executes from the 1:1 early
mapping of memory.

Setting a global function pointer will therefore involve special
tricks to ensure that taking the address of this function will produce
an address that uses the correct translation.

So if we can, it would be better to put it somewhere else.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-03-08 23:01                   ` Michael Roth
@ 2024-03-11 21:05                     ` Kevin Loughlin
  0 siblings, 0 replies; 24+ messages in thread
From: Kevin Loughlin @ 2024-03-11 21:05 UTC (permalink / raw)
  To: Michael Roth
  Cc: Mike Stunes, acdunlap, alexander.shishkin, andrisaar, ardb, bhe,
	Borislav Petkov, Brijesh Singh, dave.hansen, Dionna Amalie Glaze,
	grobler, hpa, jacobhxu, jpoimboe, Huang, Kai, linux-kernel,
	mingo, peterz, Peter Gonda, ross.lagerwall, sidtelang, tglx,
	Thomas Lendacky, x86, ytcoode

On Fri, Mar 8, 2024 at 3:01 PM Michael Roth <michael.roth@amd.com> wrote:
>
> On Fri, Mar 08, 2024 at 04:30:55PM -0500, Kevin Loughlin wrote:
> > On Fri, Mar 8, 2024 at 3:44 PM Michael Roth <michael.roth@amd.com> wrote:
> > >
> > > On Fri, Mar 08, 2024 at 11:10:43AM -0500, Kevin Loughlin wrote:
> > > > On Mon, Feb 26, 2024 at 2:16 PM Mike Stunes <mike.stunes@broadcom.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > > On Feb 22, 2024, at 12:24 PM, Kevin Loughlin <kevinloughlin@google.com> wrote:
> > > > > >
> > > > > > SEV-SNP requires encrypted memory to be validated before access.
> > > > > > Because the ROM memory range is not part of the e820 table, it is not
> > > > > > pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
> > > > > > to access this range, the guest must first validate the range.
> > > > > >
> > > > > > The current SEV-SNP code does indeed scan the ROM range during early
> > > > > > boot and thus attempts to validate the ROM range in probe_roms().
> > > > > > However, this behavior is neither necessary nor sufficient.
> > > > > >
> > > > > > With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
> > > > > > CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> > > > > > attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> > > > > > falls in the ROM range) prior to validation. The specific problematic
> > > > > > call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> > > > > > in a crash during boot if SEV-SNP is enabled under these conditions.
> > > > > >
> > > > > > With regards to necessity, SEV-SNP guests currently read garbage (which
> > > > > > changes across boots) from the ROM range, meaning these scans are
> > > > > > unnecessary. The guest reads garbage because the legacy ROM range
> > > > > > is unencrypted data but is accessed via an encrypted PMD during early
> > > > > > boot (where the PMD is marked as encrypted due to potentially mapping
> > > > > > actually-encrypted data in other PMD-contained ranges).
> > > > > >
> > > > > > While one solution would be to overhaul the early PMD mapping to treat
> > > > > > the ROM region of the PMD as unencrypted, SEV-SNP guests do not rely on
> > > > > > data from the legacy ROM region during early boot (nor can they
> > > > > > currently, since the data would be garbage that changes across boots).
> > > > > > As such, this patch opts for the simpler approach of skipping the ROM
> > > > > > range scans (and the otherwise-necessary range validation) during
> > > > > > SEV-SNP guest early boot.
> > > > > >
> > > > > > Ultimatly, the potential SEV-SNP guest crash due to lack of ROM range
> > > > > > validation is avoided by simply not accessing the ROM range.
> > > > > >
> > > > > > Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
> > > > > > Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
> > > > > > ---
> > > > > > arch/x86/include/asm/sev.h   |  2 --
> > > > > > arch/x86/kernel/mpparse.c    |  7 +++++++
> > > > > > arch/x86/kernel/probe_roms.c | 11 ++++-------
> > > > > > arch/x86/kernel/sev.c        | 15 ---------------
> > > > > > drivers/firmware/dmi_scan.c  |  7 ++++++-
> > > > > > 5 files changed, 17 insertions(+), 25 deletions(-)
> > > > > >
> > > > > > diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> > > > > > index 5b4a1ce3d368..474c24ba0f6f 100644
> > > > > > --- a/arch/x86/include/asm/sev.h
> > > > > > +++ b/arch/x86/include/asm/sev.h
> > > > > > @@ -203,7 +203,6 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
> > > > > > unsigned long npages);
> > > > > > void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
> > > > > > unsigned long npages);
> > > > > > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
> > > > > > void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
> > > > > > void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
> > > > > > void snp_set_wakeup_secondary_cpu(void);
> > > > > > @@ -227,7 +226,6 @@ static inline void __init
> > > > > > early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > > > > > static inline void __init
> > > > > > early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > > > > > -static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
> > > > > > static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
> > > > > > static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
> > > > > > static inline void snp_set_wakeup_secondary_cpu(void) { }
> > > > > > diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
> > > > > > index b223922248e9..39ea771e2d4c 100644
> > > > > > --- a/arch/x86/kernel/mpparse.c
> > > > > > +++ b/arch/x86/kernel/mpparse.c
> > > > > > @@ -553,6 +553,13 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
> > > > > >    base, base + length - 1);
> > > > > > BUILD_BUG_ON(sizeof(*mpf) != 16);
> > > > > >
> > > > > > + /*
> > > > > > + * Skip scan in SEV-SNP guest if it would touch the legacy ROM region,
> > > > > > + * as this memory is not pre-validated and would thus cause a crash.
> > > > > > + */
> > > > > > + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && base < 0x100000 && base + length >= 0xC0000)
> > > > > > + return 0;
> > > > > > +
> > > > > > while (length > 0) {
> > > > > > bp = early_memremap(base, length);
> > > > > > mpf = (struct mpf_intel *)bp;
> > > > > > diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
> > > > > > index 319fef37d9dc..84ff4b052fc1 100644
> > > > > > --- a/arch/x86/kernel/probe_roms.c
> > > > > > +++ b/arch/x86/kernel/probe_roms.c
> > > > > > @@ -204,14 +204,11 @@ void __init probe_roms(void)
> > > > > > int i;
> > > > > >
> > > > > > /*
> > > > > > - * The ROM memory range is not part of the e820 table and is therefore not
> > > > > > - * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
> > > > > > - * memory, and SNP requires encrypted memory to be validated before access.
> > > > > > - * Do that here.
> > > > > > + * These probes are skipped in SEV-SNP guests because the ROM range
> > > > > > + * is not pre-validated, meaning access would cause a crash.
> > > > > > */
> > > > > > - snp_prep_memory(video_rom_resource.start,
> > > > > > - ((system_rom_resource.end + 1) - video_rom_resource.start),
> > > > > > - SNP_PAGE_STATE_PRIVATE);
> > > > > > + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> > > > > > + return;
> > > > > >
> > > > > > /* video rom */
> > > > > > upper = adapter_rom_resources[0].start;
> > > > > > diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> > > > > > index c67285824e82..d2362631da91 100644
> > > > > > --- a/arch/x86/kernel/sev.c
> > > > > > +++ b/arch/x86/kernel/sev.c
> > > > > > @@ -774,21 +774,6 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
> > > > > > early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED);
> > > > > > }
> > > > > >
> > > > > > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
> > > > > > -{
> > > > > > - unsigned long vaddr, npages;
> > > > > > -
> > > > > > - vaddr = (unsigned long)__va(paddr);
> > > > > > - npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
> > > > > > -
> > > > > > - if (op == SNP_PAGE_STATE_PRIVATE)
> > > > > > - early_snp_set_memory_private(vaddr, paddr, npages);
> > > > > > - else if (op == SNP_PAGE_STATE_SHARED)
> > > > > > - early_snp_set_memory_shared(vaddr, paddr, npages);
> > > > > > - else
> > > > > > - WARN(1, "invalid memory op %d\n", op);
> > > > > > -}
> > > > > > -
> > > > > > static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
> > > > > >       unsigned long vaddr_end, int op)
> > > > > > {
> > > > > > diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
> > > > > > index 015c95a825d3..22e27087eb5b 100644
> > > > > > --- a/drivers/firmware/dmi_scan.c
> > > > > > +++ b/drivers/firmware/dmi_scan.c
> > > > > > @@ -703,7 +703,12 @@ static void __init dmi_scan_machine(void)
> > > > > > dmi_available = 1;
> > > > > > return;
> > > > > > }
> > > > > > - } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK)) {
> > > > > > + } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK) &&
> > > > > > + !cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
> > > > > > + /*
> > > > > > + * This scan is skipped in SEV-SNP guests because the ROM range
> > > > > > + * is not pre-validated, meaning access would cause a crash.
> > > > > > + */
> > > > > > p = dmi_early_remap(SMBIOS_ENTRY_POINT_SCAN_START, 0x10000);
> > > > > > if (p == NULL)
> > > > > > goto error;
> > > > > > --
> > > > > > 2.44.0.rc0.258.g7320e95886-goog
> > > > > >
> > > > > >
> > > > >
> > > > > In addition to these changes, I also had to skip pirq_find_routing_table if SEV-SNP is active.
> > > >
> > > > Thanks. I will update this in v3.
> > >
> > > There's also another access a bit later in boot:
> > >
> > >   static __init int eisa_bus_probe(void)
> > >   {
> > >     ...
> > >     ioremap(0x0FFFD9, 4);
> > >   }
> > >
> > > This time it's via ioremap() with the encryption bit *unset*, so it
> > > won't necessarily cause a crash but it's inconsistent with the early
> > > page table having that region set as encrypted.
> > >
> > > We discussed unsetting the encryption bit in early page table with
> > > security folks and the general consensus was that *if* any VMM/firmware
> > > ever came along that does want to make use of legacy region for any reason
> > > (such as providing DMI/SMBIOS info) it would be safest to require that they
> > > encrypt the data in the region before handing off to guest kernel, so it
> > > makes sense to patch away unecrypted accesses to the legacy region so the
> > > don't cause problems down the road (like causing implicit page state
> > > change from private->shared and throwing away data in the region later
> > > in boot).
> >
> > Sounds good, thanks. Since this one won't cause crashes, I will place
> > it in a separate patch in the series to separate (current) functional
> > fixes from cleanup, especially since there may be similar legacy
> > probes to cleanup in various types of guests. Please let me know if
> > you feel differently or have additional thoughts.
>
> I think it could still be argued that it's a fix. It's just that the
> main set of fixes avoid reading garbage for VMM/firmwares that *don't*
> encrypt these regions, whereas this additional fix handles the case for
> VMM/firmwares that *do* encrypt these regions. It's possible they exist
> in the case of SEV (though I don't know of any). Might still make sense
> to distinguish the 2 cases since latter is more theoretical, but both
> still address the kernel modifying its behavior based on scanning random
> garbage for strings.

Fair enough. I will include it in the same patch in that case. Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v3] x86/kernel: skip ROM range scans and validation for SEV-SNP guests
  2024-03-11 10:43                   ` Ard Biesheuvel
@ 2024-03-13 12:15                     ` Kevin Loughlin
  2024-03-26 14:39                       ` [tip: x86/urgent] x86/sev: Skip " tip-bot2 for Kevin Loughlin
  0 siblings, 1 reply; 24+ messages in thread
From: Kevin Loughlin @ 2024-03-13 12:15 UTC (permalink / raw)
  To: ardb
  Cc: acdunlap, alexander.shishkin, andrisaar, bhe, bp, brijesh.singh,
	dave.hansen, dionnaglaze, grobler, hpa, jacobhxu, jpoimboe,
	kai.huang, kevinloughlin, linux-kernel, michael.roth, mingo,
	peterz, pgonda, ross.lagerwall, sidtelang, tglx, thomas.lendacky,
	x86, ytcoode

SEV-SNP requires encrypted memory to be validated before access.
Because the ROM memory range is not part of the e820 table, it is not
pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
to access this range, the guest must first validate the range.

The current SEV-SNP code does indeed scan the ROM range during early
boot and thus attempts to validate the ROM range in probe_roms().
However, this behavior is neither sufficient nor necessary for the
following reasons.

With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
falls in the ROM range) prior to validation. For example, Project Oak
Stage 0 provides a minimal guest firmware that currently meets these
configuration conditions, meaning guests booting atop Oak Stage 0
firmware encounter a problematic call chain during dmi_setup() ->
dmi_scan_machine() that results in a crash during boot if SEV-SNP is
enabled.

With regards to necessity, SEV-SNP guests generally read garbage (which
changes across boots) from the ROM range, meaning these scans are
unnecessary. The guest reads garbage because the legacy ROM range
is unencrypted data but is accessed via an encrypted PMD during early
boot (where the PMD is marked as encrypted due to potentially mapping
actually-encrypted data in other PMD-contained ranges).

In one exceptional case, EISA probing treats the ROM range as
unencrypted data, which is inconsistent with other probing.

Continuing to allow SEV-SNP guests to use garbage and to inconsistently
classify ROM range encryption status can trigger undesirable behavior.
For instance, if garbage bytes appear to be a valid signature, memory
may be unnecessarily reserved for the ROM range. Future code or other
use cases may result in more problematic (arbitrary) behavior that
should be avoided.

While one solution would be to overhaul the early PMD mapping to always
treat the ROM region of the PMD as unencrypted, SEV-SNP guests do not
currently rely on data from the ROM region during early boot (and even
if they did, they would be mostly relying on garbage data anyways).

As a simpler solution, skip the ROM range scans (and the otherwise-
necessary range validation) during SEV-SNP guest early boot. The
potential SEV-SNP guest crash due to lack of ROM range validation is
thus avoided by simply not accessing the ROM range.

In most cases, skip the scans by overriding problematic x86_init
functions during sme_early_init() to SNP-safe variants, which can be
likened to x86_init overrides done for other platforms (ex: Xen); such
overrides also avoid the spread of cc_platform_has() checks throughout
the tree. In the exceptional EISA case, still use cc_platform_has() for
the simplest change, given (1) checks for guest type (ex: Xen domain
status) are already performed here, and (2) these checks occur in a
subsys initcall instead of an x86_init function.

Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
---
 arch/x86/include/asm/sev.h      |  4 ++--
 arch/x86/include/asm/x86_init.h |  3 ++-
 arch/x86/kernel/eisa.c          |  3 ++-
 arch/x86/kernel/probe_roms.c    | 10 ----------
 arch/x86/kernel/setup.c         |  3 +--
 arch/x86/kernel/sev.c           | 27 ++++++++++++---------------
 arch/x86/kernel/x86_init.c      |  2 ++
 arch/x86/mm/mem_encrypt_amd.c   | 17 +++++++++++++++++
 8 files changed, 38 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 9477b4053bce..07e125f32528 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -218,12 +218,12 @@ void early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
 				  unsigned long npages);
 void early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
 				 unsigned long npages);
-void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
 void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
 void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
 void snp_set_wakeup_secondary_cpu(void);
 bool snp_init(struct boot_params *bp);
 void __noreturn snp_abort(void);
+void snp_dmi_setup(void);
 int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct snp_guest_request_ioctl *rio);
 void snp_accept_memory(phys_addr_t start, phys_addr_t end);
 u64 snp_get_unsupported_features(u64 status);
@@ -244,12 +244,12 @@ static inline void __init
 early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
 static inline void __init
 early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
-static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
 static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
 static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
 static inline void snp_set_wakeup_secondary_cpu(void) { }
 static inline bool snp_init(struct boot_params *bp) { return false; }
 static inline void snp_abort(void) { }
+static inline void snp_dmi_setup(void) { }
 static inline int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct snp_guest_request_ioctl *rio)
 {
 	return -ENOTTY;
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index b89b40f250e6..6149eabe200f 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -30,12 +30,13 @@ struct x86_init_mpparse {
  * @reserve_resources:		reserve the standard resources for the
  *				platform
  * @memory_setup:		platform specific memory setup
- *
+ * @dmi_setup:			platform specific DMI setup
  */
 struct x86_init_resources {
 	void (*probe_roms)(void);
 	void (*reserve_resources)(void);
 	char *(*memory_setup)(void);
+	void (*dmi_setup)(void);
 };
 
 /**
diff --git a/arch/x86/kernel/eisa.c b/arch/x86/kernel/eisa.c
index e963344b0449..53935b4d62e3 100644
--- a/arch/x86/kernel/eisa.c
+++ b/arch/x86/kernel/eisa.c
@@ -2,6 +2,7 @@
 /*
  * EISA specific code
  */
+#include <linux/cc_platform.h>
 #include <linux/ioport.h>
 #include <linux/eisa.h>
 #include <linux/io.h>
@@ -12,7 +13,7 @@ static __init int eisa_bus_probe(void)
 {
 	void __iomem *p;
 
-	if (xen_pv_domain() && !xen_initial_domain())
+	if ((xen_pv_domain() && !xen_initial_domain()) || cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
 		return 0;
 
 	p = ioremap(0x0FFFD9, 4);
diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
index 319fef37d9dc..cc2c34ba7228 100644
--- a/arch/x86/kernel/probe_roms.c
+++ b/arch/x86/kernel/probe_roms.c
@@ -203,16 +203,6 @@ void __init probe_roms(void)
 	unsigned char c;
 	int i;
 
-	/*
-	 * The ROM memory range is not part of the e820 table and is therefore not
-	 * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
-	 * memory, and SNP requires encrypted memory to be validated before access.
-	 * Do that here.
-	 */
-	snp_prep_memory(video_rom_resource.start,
-			((system_rom_resource.end + 1) - video_rom_resource.start),
-			SNP_PAGE_STATE_PRIVATE);
-
 	/* video rom */
 	upper = adapter_rom_resources[0].start;
 	for (start = video_rom_resource.start; start < upper; start += 2048) {
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 46d5a8c520ad..6064dffe590c 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -9,7 +9,6 @@
 #include <linux/console.h>
 #include <linux/crash_dump.h>
 #include <linux/dma-map-ops.h>
-#include <linux/dmi.h>
 #include <linux/efi.h>
 #include <linux/ima.h>
 #include <linux/init_ohci1394_dma.h>
@@ -902,7 +901,7 @@ void __init setup_arch(char **cmdline_p)
 		efi_init();
 
 	reserve_ibft_region();
-	dmi_setup();
+	x86_init.resources.dmi_setup();
 
 	/*
 	 * VMware detection requires dmi to be available, so this
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index b59b09c2f284..7e1e63cc48e6 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -23,6 +23,7 @@
 #include <linux/platform_device.h>
 #include <linux/io.h>
 #include <linux/psp-sev.h>
+#include <linux/dmi.h>
 #include <uapi/linux/sev-guest.h>
 
 #include <asm/init.h>
@@ -795,21 +796,6 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
 	early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED);
 }
 
-void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
-{
-	unsigned long vaddr, npages;
-
-	vaddr = (unsigned long)__va(paddr);
-	npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
-
-	if (op == SNP_PAGE_STATE_PRIVATE)
-		early_snp_set_memory_private(vaddr, paddr, npages);
-	else if (op == SNP_PAGE_STATE_SHARED)
-		early_snp_set_memory_shared(vaddr, paddr, npages);
-	else
-		WARN(1, "invalid memory op %d\n", op);
-}
-
 static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
 				       unsigned long vaddr_end, int op)
 {
@@ -2136,6 +2122,17 @@ void __head __noreturn snp_abort(void)
 	sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
 }
 
+/*
+ * SEV-SNP guests should only execute dmi_setup() if EFI_CONFIG_TABLES are
+ * enabled, as the alternative (fallback) logic for DMI probing in the legacy
+ * ROM region can cause a crash since this region is not pre-validated.
+ */
+void __init snp_dmi_setup(void)
+{
+	if (efi_enabled(EFI_CONFIG_TABLES))
+		dmi_setup();
+}
+
 static void dump_cpuid_table(void)
 {
 	const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index a42830dc151b..d5dc5a92635a 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -3,6 +3,7 @@
  *
  *  For licencing details see kernel-base/COPYING
  */
+#include <linux/dmi.h>
 #include <linux/init.h>
 #include <linux/ioport.h>
 #include <linux/export.h>
@@ -66,6 +67,7 @@ struct x86_init_ops x86_init __initdata = {
 		.probe_roms		= probe_roms,
 		.reserve_resources	= reserve_standard_io_resources,
 		.memory_setup		= e820__memory_setup_default,
+		.dmi_setup		= dmi_setup,
 	},
 
 	.mpparse = {
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index 70b91de2e053..01e7feda6ed7 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -492,6 +492,23 @@ void __init sme_early_init(void)
 	 */
 	if (sev_status & MSR_AMD64_SEV_ENABLED)
 		ia32_disable();
+
+	/*
+	 * Override init functions that scan the ROM region in SEV-SNP guests,
+	 * as this memory is not pre-validated and would thus cause a crash.
+	 */
+	if (sev_status & MSR_AMD64_SEV_SNP_ENABLED) {
+		x86_init.mpparse.find_mptable = x86_init_noop;
+		x86_init.pci.init_irq = x86_init_noop;
+		x86_init.resources.probe_roms = x86_init_noop;
+
+		/*
+		 * DMI setup behavior for SEV-SNP guests depends on
+		 * efi_enabled(EFI_CONFIG_TABLES), which we haven't parsed yet.
+		 * snp_dmi_setup() will run after we have this information.
+		 */
+		x86_init.resources.dmi_setup = snp_dmi_setup;
+	}
 }
 
 void __init mem_encrypt_free_decrypted_mem(void)
-- 
2.44.0.278.ge034bb2e1d-goog


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [tip: x86/urgent] x86/sev: Skip ROM range scans and validation for SEV-SNP guests
  2024-03-13 12:15                     ` [PATCH v3] " Kevin Loughlin
@ 2024-03-26 14:39                       ` tip-bot2 for Kevin Loughlin
  0 siblings, 0 replies; 24+ messages in thread
From: tip-bot2 for Kevin Loughlin @ 2024-03-26 14:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Kevin Loughlin, Borislav Petkov (AMD), stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     0f4a1e80989aca185d955fcd791d7750082044a2
Gitweb:        https://git.kernel.org/tip/0f4a1e80989aca185d955fcd791d7750082044a2
Author:        Kevin Loughlin <kevinloughlin@google.com>
AuthorDate:    Wed, 13 Mar 2024 12:15:46 
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Tue, 26 Mar 2024 15:22:35 +01:00

x86/sev: Skip ROM range scans and validation for SEV-SNP guests

SEV-SNP requires encrypted memory to be validated before access.
Because the ROM memory range is not part of the e820 table, it is not
pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
to access this range, the guest must first validate the range.

The current SEV-SNP code does indeed scan the ROM range during early
boot and thus attempts to validate the ROM range in probe_roms().
However, this behavior is neither sufficient nor necessary for the
following reasons:

* With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
  CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
  attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
  falls in the ROM range) prior to validation.

  For example, Project Oak Stage 0 provides a minimal guest firmware
  that currently meets these configuration conditions, meaning guests
  booting atop Oak Stage 0 firmware encounter a problematic call chain
  during dmi_setup() -> dmi_scan_machine() that results in a crash
  during boot if SEV-SNP is enabled.

* With regards to necessity, SEV-SNP guests generally read garbage
  (which changes across boots) from the ROM range, meaning these scans
  are unnecessary. The guest reads garbage because the legacy ROM range
  is unencrypted data but is accessed via an encrypted PMD during early
  boot (where the PMD is marked as encrypted due to potentially mapping
  actually-encrypted data in other PMD-contained ranges).

In one exceptional case, EISA probing treats the ROM range as
unencrypted data, which is inconsistent with other probing.

Continuing to allow SEV-SNP guests to use garbage and to inconsistently
classify ROM range encryption status can trigger undesirable behavior.
For instance, if garbage bytes appear to be a valid signature, memory
may be unnecessarily reserved for the ROM range. Future code or other
use cases may result in more problematic (arbitrary) behavior that
should be avoided.

While one solution would be to overhaul the early PMD mapping to always
treat the ROM region of the PMD as unencrypted, SEV-SNP guests do not
currently rely on data from the ROM region during early boot (and even
if they did, they would be mostly relying on garbage data anyways).

As a simpler solution, skip the ROM range scans (and the otherwise-
necessary range validation) during SEV-SNP guest early boot. The
potential SEV-SNP guest crash due to lack of ROM range validation is
thus avoided by simply not accessing the ROM range.

In most cases, skip the scans by overriding problematic x86_init
functions during sme_early_init() to SNP-safe variants, which can be
likened to x86_init overrides done for other platforms (ex: Xen); such
overrides also avoid the spread of cc_platform_has() checks throughout
the tree.

In the exceptional EISA case, still use cc_platform_has() for the
simplest change, given (1) checks for guest type (ex: Xen domain status)
are already performed here, and (2) these checks occur in a subsys
initcall instead of an x86_init function.

  [ bp: Massage commit message, remove "we"s. ]

Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20240313121546.2964854-1-kevinloughlin@google.com
---
 arch/x86/include/asm/sev.h      |  4 ++--
 arch/x86/include/asm/x86_init.h |  3 ++-
 arch/x86/kernel/eisa.c          |  3 ++-
 arch/x86/kernel/probe_roms.c    | 10 ----------
 arch/x86/kernel/setup.c         |  3 +--
 arch/x86/kernel/sev.c           | 27 ++++++++++++---------------
 arch/x86/kernel/x86_init.c      |  2 ++
 arch/x86/mm/mem_encrypt_amd.c   | 18 ++++++++++++++++++
 8 files changed, 39 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 9477b40..07e125f 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -218,12 +218,12 @@ void early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
 				  unsigned long npages);
 void early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
 				 unsigned long npages);
-void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
 void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
 void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
 void snp_set_wakeup_secondary_cpu(void);
 bool snp_init(struct boot_params *bp);
 void __noreturn snp_abort(void);
+void snp_dmi_setup(void);
 int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct snp_guest_request_ioctl *rio);
 void snp_accept_memory(phys_addr_t start, phys_addr_t end);
 u64 snp_get_unsupported_features(u64 status);
@@ -244,12 +244,12 @@ static inline void __init
 early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
 static inline void __init
 early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
-static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
 static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
 static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
 static inline void snp_set_wakeup_secondary_cpu(void) { }
 static inline bool snp_init(struct boot_params *bp) { return false; }
 static inline void snp_abort(void) { }
+static inline void snp_dmi_setup(void) { }
 static inline int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct snp_guest_request_ioctl *rio)
 {
 	return -ENOTTY;
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index b89b40f..6149eab 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -30,12 +30,13 @@ struct x86_init_mpparse {
  * @reserve_resources:		reserve the standard resources for the
  *				platform
  * @memory_setup:		platform specific memory setup
- *
+ * @dmi_setup:			platform specific DMI setup
  */
 struct x86_init_resources {
 	void (*probe_roms)(void);
 	void (*reserve_resources)(void);
 	char *(*memory_setup)(void);
+	void (*dmi_setup)(void);
 };
 
 /**
diff --git a/arch/x86/kernel/eisa.c b/arch/x86/kernel/eisa.c
index e963344..53935b4 100644
--- a/arch/x86/kernel/eisa.c
+++ b/arch/x86/kernel/eisa.c
@@ -2,6 +2,7 @@
 /*
  * EISA specific code
  */
+#include <linux/cc_platform.h>
 #include <linux/ioport.h>
 #include <linux/eisa.h>
 #include <linux/io.h>
@@ -12,7 +13,7 @@ static __init int eisa_bus_probe(void)
 {
 	void __iomem *p;
 
-	if (xen_pv_domain() && !xen_initial_domain())
+	if ((xen_pv_domain() && !xen_initial_domain()) || cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
 		return 0;
 
 	p = ioremap(0x0FFFD9, 4);
diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
index 319fef3..cc2c34b 100644
--- a/arch/x86/kernel/probe_roms.c
+++ b/arch/x86/kernel/probe_roms.c
@@ -203,16 +203,6 @@ void __init probe_roms(void)
 	unsigned char c;
 	int i;
 
-	/*
-	 * The ROM memory range is not part of the e820 table and is therefore not
-	 * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
-	 * memory, and SNP requires encrypted memory to be validated before access.
-	 * Do that here.
-	 */
-	snp_prep_memory(video_rom_resource.start,
-			((system_rom_resource.end + 1) - video_rom_resource.start),
-			SNP_PAGE_STATE_PRIVATE);
-
 	/* video rom */
 	upper = adapter_rom_resources[0].start;
 	for (start = video_rom_resource.start; start < upper; start += 2048) {
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index ef20650..0109e6c 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -9,7 +9,6 @@
 #include <linux/console.h>
 #include <linux/crash_dump.h>
 #include <linux/dma-map-ops.h>
-#include <linux/dmi.h>
 #include <linux/efi.h>
 #include <linux/ima.h>
 #include <linux/init_ohci1394_dma.h>
@@ -902,7 +901,7 @@ void __init setup_arch(char **cmdline_p)
 		efi_init();
 
 	reserve_ibft_region();
-	dmi_setup();
+	x86_init.resources.dmi_setup();
 
 	/*
 	 * VMware detection requires dmi to be available, so this
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index b59b09c..7e1e63c 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -23,6 +23,7 @@
 #include <linux/platform_device.h>
 #include <linux/io.h>
 #include <linux/psp-sev.h>
+#include <linux/dmi.h>
 #include <uapi/linux/sev-guest.h>
 
 #include <asm/init.h>
@@ -795,21 +796,6 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
 	early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED);
 }
 
-void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
-{
-	unsigned long vaddr, npages;
-
-	vaddr = (unsigned long)__va(paddr);
-	npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
-
-	if (op == SNP_PAGE_STATE_PRIVATE)
-		early_snp_set_memory_private(vaddr, paddr, npages);
-	else if (op == SNP_PAGE_STATE_SHARED)
-		early_snp_set_memory_shared(vaddr, paddr, npages);
-	else
-		WARN(1, "invalid memory op %d\n", op);
-}
-
 static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
 				       unsigned long vaddr_end, int op)
 {
@@ -2136,6 +2122,17 @@ void __head __noreturn snp_abort(void)
 	sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
 }
 
+/*
+ * SEV-SNP guests should only execute dmi_setup() if EFI_CONFIG_TABLES are
+ * enabled, as the alternative (fallback) logic for DMI probing in the legacy
+ * ROM region can cause a crash since this region is not pre-validated.
+ */
+void __init snp_dmi_setup(void)
+{
+	if (efi_enabled(EFI_CONFIG_TABLES))
+		dmi_setup();
+}
+
 static void dump_cpuid_table(void)
 {
 	const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index a42830d..d5dc5a9 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -3,6 +3,7 @@
  *
  *  For licencing details see kernel-base/COPYING
  */
+#include <linux/dmi.h>
 #include <linux/init.h>
 #include <linux/ioport.h>
 #include <linux/export.h>
@@ -66,6 +67,7 @@ struct x86_init_ops x86_init __initdata = {
 		.probe_roms		= probe_roms,
 		.reserve_resources	= reserve_standard_io_resources,
 		.memory_setup		= e820__memory_setup_default,
+		.dmi_setup		= dmi_setup,
 	},
 
 	.mpparse = {
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index 70b91de..422602f 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -492,6 +492,24 @@ void __init sme_early_init(void)
 	 */
 	if (sev_status & MSR_AMD64_SEV_ENABLED)
 		ia32_disable();
+
+	/*
+	 * Override init functions that scan the ROM region in SEV-SNP guests,
+	 * as this memory is not pre-validated and would thus cause a crash.
+	 */
+	if (sev_status & MSR_AMD64_SEV_SNP_ENABLED) {
+		x86_init.mpparse.find_mptable = x86_init_noop;
+		x86_init.pci.init_irq = x86_init_noop;
+		x86_init.resources.probe_roms = x86_init_noop;
+
+		/*
+		 * DMI setup behavior for SEV-SNP guests depends on
+		 * efi_enabled(EFI_CONFIG_TABLES), which hasn't been
+		 * parsed yet. snp_dmi_setup() will run after that
+		 * parsing has happened.
+		 */
+		x86_init.resources.dmi_setup = snp_dmi_setup;
+	}
 }
 
 void __init mem_encrypt_free_decrypted_mem(void)

^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2024-03-26 14:39 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-13  4:07 [PATCH] x86/kernel: Validate ROM before DMI scanning when SEV-SNP is active Kevin Loughlin
2024-02-13 20:02 ` Michael Roth
2024-02-13 23:10   ` Kevin Loughlin
2024-02-16 22:50     ` Michael Roth
2024-02-21 22:50       ` Kevin Loughlin
2024-02-22 20:20         ` Michael Roth
2024-02-22 20:24         ` [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests Kevin Loughlin
2024-02-26 19:15           ` Mike Stunes
2024-03-08 16:10             ` Kevin Loughlin
2024-03-08 20:41               ` Michael Roth
2024-03-08 21:30                 ` Kevin Loughlin
2024-03-08 23:01                   ` Michael Roth
2024-03-11 21:05                     ` Kevin Loughlin
2024-02-29 16:54           ` Borislav Petkov
2024-03-08 16:14             ` Kevin Loughlin
2024-03-08 10:30           ` Ard Biesheuvel
2024-03-08 11:00             ` Borislav Petkov
2024-03-08 11:44               ` Ard Biesheuvel
2024-03-10 17:12                 ` Kevin Loughlin
2024-03-11 10:43                   ` Ard Biesheuvel
2024-03-13 12:15                     ` [PATCH v3] " Kevin Loughlin
2024-03-26 14:39                       ` [tip: x86/urgent] x86/sev: Skip " tip-bot2 for Kevin Loughlin
2024-03-08 20:52               ` [PATCH v2] x86/kernel: skip " Kevin Loughlin
2024-03-08 20:50             ` Kevin Loughlin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.