On Tue, Jun 14, 2022, Tom Lendacky wrote:
> On 6/14/22 10:43, Sean Christopherson wrote:
> > On Tue, Jun 14, 2022, Sean Christopherson wrote:
> > > s/Brijesh/Michael
> > >
> > > On Mon, Mar 07, 2022, Brijesh Singh wrote:
> > > > The encryption attribute for the .bss..decrypted section is cleared in the
> > > > initial page table build. This is because the section contains the data
> > > > that need to be shared between the guest and the hypervisor.
> > > >
> > > > When SEV-SNP is active, just clearing the encryption attribute in the
> > > > page table is not enough. The page state need to be updated in the RMP
> > > > table.
> > > >
> > > > Signed-off-by: Brijesh Singh
> > > > ---
> > > >  arch/x86/kernel/head64.c | 13 +++++++++++++
> > > >  1 file changed, 13 insertions(+)
> > > >
> > > > diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> > > > index 83514b9827e6..656d2f3e2cf0 100644
> > > > --- a/arch/x86/kernel/head64.c
> > > > +++ b/arch/x86/kernel/head64.c
> > > > @@ -143,7 +143,20 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
> > > >  	if (sme_get_me_mask()) {
> > > >  		vaddr = (unsigned long)__start_bss_decrypted;
> > > >  		vaddr_end = (unsigned long)__end_bss_decrypted;
> > > > +
> > > >  		for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
> > > > +			/*
> > > > +			 * On SNP, transition the page to shared in the RMP table so that
> > > > +			 * it is consistent with the page table attribute change.
> > > > +			 *
> > > > +			 * __start_bss_decrypted has a virtual address in the high range
> > > > +			 * mapping (kernel .text). PVALIDATE, by way of
> > > > +			 * early_snp_set_memory_shared(), requires a valid virtual
> > > > +			 * address but the kernel is currently running off of the identity
> > > > +			 * mapping so use __pa() to get a *currently* valid virtual address.
> > > > +			 */
> > > > +			early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
> > >
> > > This breaks SME on Rome and Milan when compiling with clang-13. I haven't been
> > > able to figure out exactly what goes wrong. printk isn't functional at this point,
> > > and interactive debug during boot on our test systems is beyond me. I can't even
> > > verify that the bug is specific to clang because the draconian build system for our
> > > test systems apparently is stuck pointing at gcc-4.9.
> > >
> > > I suspect the issue is related to relocation and/or encrypting memory, as skipping
> > > the call to early_snp_set_memory_shared() if SNP isn't active masks the issue.
> > > I've dug through the assembly and haven't spotted a smoking gun, e.g. no obvious
> > > use of absolute addresses.
> > >
> > > Forcing a VM through the same path doesn't fail. I can't test an SEV guest at the
> > > moment because INIT_EX is also broken.
> >
> > The SEV INIT_EX was a PEBKAC issue. An SEV guest boots just fine with a clang-built
> > kernel, so either it's a finnicky relocation issue or something specific to SME.
>
> I just built and booted 5.19-rc2 with clang-13 and SME enabled without issue:
>
> [ 4.118226] Memory Encryption Features active: AMD SME

Phooey.

> Maybe something with your kernel config? Can you send me your config?

Attached. If you can't repro, I'll find someone on our end to work on this.

Thanks!
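
For readers following along, the experiment mentioned in the thread (skipping the call to early_snp_set_memory_shared() when SNP isn't active) can be sketched in user space roughly as below. This is only an illustration of the loop shape and the guard, not the actual kernel change: fake_set_memory_shared(), walk_decrypted_range(), the snp_active flag, and the pages_marked_shared counter are all hypothetical names invented here, and the stub merely counts pages instead of touching the RMP table or issuing PVALIDATE.

```c
#include <assert.h>
#include <stdbool.h>

#define PMD_SIZE     (2UL * 1024 * 1024) /* 2 MiB covered by one x86-64 PMD entry */
#define PTRS_PER_PMD 512                 /* 4 KiB pages per PMD-sized region */

/* Hypothetical counter: total pages the stub was asked to mark shared. */
static unsigned long pages_marked_shared;

/*
 * Hypothetical stand-in for early_snp_set_memory_shared(). The real
 * function updates page state for SNP; this stub only counts pages.
 */
static void fake_set_memory_shared(unsigned long vaddr, unsigned long paddr,
                                   unsigned long npages)
{
	(void)vaddr;
	(void)paddr;
	pages_marked_shared += npages;
}

/*
 * Walk [start, end) in PMD_SIZE steps, as the loop in the patch does, but
 * only call the "set shared" helper when snp_active is true (an assumed
 * flag standing in for whatever check the kernel would use).
 * Returns the number of PMD-sized steps taken.
 */
static unsigned long walk_decrypted_range(unsigned long start,
                                          unsigned long end,
                                          bool snp_active)
{
	unsigned long steps = 0;

	assert(start <= end);

	for (unsigned long vaddr = start; vaddr < end; vaddr += PMD_SIZE) {
		if (snp_active)
			fake_set_memory_shared(vaddr, vaddr, PTRS_PER_PMD);
		steps++;
	}

	return steps;
}
```

With snp_active set to false the walk still visits every PMD-sized step but never reaches the helper, which matches the description in the thread that the guard masks the failure on plain SME hosts without explaining its root cause.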