From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.7 required=3.0 tests=BAYES_00,DKIM_ADSP_CUSTOM_MED, DKIM_INVALID,DKIM_SIGNED,FSL_HELO_FAKE,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 30F19C47E4D for ; Wed, 14 Jul 2021 21:07:46 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B449A613CC for ; Wed, 14 Jul 2021 21:07:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B449A613CC Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 06CDD8D0001; Wed, 14 Jul 2021 17:07:46 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 01C8A6B0095; Wed, 14 Jul 2021 17:07:45 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D63108D0001; Wed, 14 Jul 2021 17:07:45 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0154.hostedemail.com [216.40.44.154]) by kanga.kvack.org (Postfix) with ESMTP id A1F966B0092 for ; Wed, 14 Jul 2021 17:07:45 -0400 (EDT) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 030001A4A7 for ; Wed, 14 Jul 2021 21:07:43 +0000 (UTC) X-FDA: 78362430006.30.F13D5C5 Received: from mail-pl1-f173.google.com (mail-pl1-f173.google.com [209.85.214.173]) by imf08.hostedemail.com (Postfix) with ESMTP id B1C2730000B1 for ; Wed, 14 Jul 2021 21:07:42 +0000 (UTC) Received: by mail-pl1-f173.google.com with SMTP id x16so2097046plg.3 for ; Wed, 14 Jul 2021 14:07:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to; bh=OrJjmDihky9L5lychbpTBZwL1+xpeDWHYbd6hVXreJM=; b=tlz5bGQLSoLrdG3yHlzIPTA4kenOD+40ek2Xmm5ZcZAuRAmgi1szyK30dlHoMqPjr1 HC6+ZapUN1ng1N8iPi3Xx+1Wk6Bzzl+zci2pX/dkzPsyvzPtUD5PHfZ6PZiaVwQ660uR uXhtnHkr98V4781RINe/3kERQ//Xiv4ZIKUkPHx8R00NMWzEOkyqBnPzIPRQIrmjR2JR AW0GQ2ltDtzB/uumhtBlcS22Ycv15zYhmDnK2LyN0oSlspTzcLWNdE2QdVVAmrvCsv7s pCi4Qnr/JcI9IbckJWsASScJV6y7KBIAdxBYawEZyIeQg1TB8oc5D6AWSh5Kkcq749bg HXbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=OrJjmDihky9L5lychbpTBZwL1+xpeDWHYbd6hVXreJM=; b=HMhdji/jZVOzvklKAbiVFAUHdHrJtSOsO6vQMAKXIdPF8/6ZXUM5X6/QSORenYqyfX edBMMYiI9X4Fa0G/Y3szFX2YIUu6J9vBdLC6rdmAUhZywl3nEa1UU/g2+qkExk7Y5rpM V3t7YC8UPkgL7Zy4KMVjkzLoaAovggZGMfxTic9t7JAYHyTwek+puSH4lSAsy3Z3l6PB lew6AhS+aMjeOy/J5EvtnJXmf6EGzLW1KnnuqJ6th0sgUaoLdQM4t0pNKKyuZHnzjzP/ n1Lr1jw2xNdS6wWCCWLY7QdnhDYb2NiXzU4F4a2CXzGGP5HG7OxAIMpc2uR7jQk+7+4G 13qQ== X-Gm-Message-State: AOAM531UvHGBUWWi04XnaauyhDe9mP7SZ98FNMP6EhvwAWuuowHD3ATo Pl9+oi0s94o5NVLdWPw33gzhUQ== X-Google-Smtp-Source: ABdhPJy6ZvJ7eEKoQQlA+dc27/Ymn/phT2UWrJkO7pxjQvI0kRR62Q2ZaSp9qAhwtDJisSV5ZIWr7g== X-Received: by 2002:a17:90a:c092:: with SMTP id o18mr5529982pjs.3.1626296861177; Wed, 14 Jul 2021 14:07:41 -0700 (PDT) Received: from google.com (157.214.185.35.bc.googleusercontent.com. [35.185.214.157]) by smtp.gmail.com with ESMTPSA id m21sm3758645pfo.159.2021.07.14.14.07.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 14 Jul 2021 14:07:40 -0700 (PDT) Date: Wed, 14 Jul 2021 21:07:37 +0000 From: Sean Christopherson To: Brijesh Singh Cc: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, linux-efi@vger.kernel.org, platform-driver-x86@vger.kernel.org, linux-coco@lists.linux.dev, linux-mm@kvack.org, linux-crypto@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Joerg Roedel , Tom Lendacky , "H. Peter Anvin" , Ard Biesheuvel , Paolo Bonzini , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Andy Lutomirski , Dave Hansen , Sergio Lopez , Peter Gonda , Peter Zijlstra , Srinivas Pandruvada , David Rientjes , Dov Murik , Tobin Feldman-Fitzthum , Borislav Petkov , Michael Roth , Vlastimil Babka , tony.luck@intel.com, npmccallum@redhat.com, brijesh.ksingh@gmail.com Subject: Re: [PATCH Part2 RFC v4 04/40] x86/sev: Add the host SEV-SNP initialization support Message-ID: References: <20210707183616.5620-1-brijesh.singh@amd.com> <20210707183616.5620-5-brijesh.singh@amd.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20210707183616.5620-5-brijesh.singh@amd.com> X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: B1C2730000B1 X-Stat-Signature: dkm9hrbnk1zhhejexk1nb9qj1nsbm44i Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=google.com header.s=20161025 header.b=tlz5bGQL; spf=pass (imf08.hostedemail.com: domain of seanjc@google.com designates 209.85.214.173 as permitted sender) smtp.mailfrom=seanjc@google.com; dmarc=pass (policy=reject) header.from=google.com X-HE-Tag: 1626296862-574908 Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Jul 07, 2021, Brijesh Singh wrote: > diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c > index aa7e37631447..f9d813d498fa 100644 > --- a/arch/x86/kernel/sev.c > +++ b/arch/x86/kernel/sev.c > @@ -24,6 +24,8 @@ > #include > #include > #include > +#include > +#include > =20 > #include > #include > @@ -40,11 +42,14 @@ > #include > #include > #include > +#include > =20 > #include "sev-internal.h" > =20 > #define DR7_RESET_VALUE 0x400 > =20 > +#define RMPTABLE_ENTRIES_OFFSET 0x4000 A comment and/or blurb in the changelog describing this magic number woul= d be quite helpful. And maybe call out that this is for the bookkeeping, e.g. #define RMPTABLE_CPU_BOOKKEEPING_SIZE 0x4000 Also, the APM doesn't actually state the exact location of the bookkeepin= g region, it only states that it's somewhere between RMP_BASE and RMP_END. = This seems to imply that the bookkeeping region is always at RMP_BASE? The region of memory between RMP_BASE and RMP_END contains a 16KB regio= n used for processor bookkeeping followed by the RMP entries, which are each 1= 6B in size. The size of the RMP determines the range of physical memory that = the hypervisor can assign to SNP-active virtual machines at runtime. The RM= P covers the system physical address space from address 0h to the address calcul= ated by: ((RMP_END + 1 =E2=80=93 RMP_BASE =E2=80=93 16KB) / 16B) x 4KB > /* For early boot hypervisor communication in SEV-ES enabled guests */ > static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE)= ; > =20 > @@ -56,6 +61,9 @@ static struct ghcb __initdata *boot_ghcb; > =20 > static u64 snp_secrets_phys; > =20 > +static unsigned long rmptable_start __ro_after_init; > +static unsigned long rmptable_end __ro_after_init; > + > /* #VC handler runtime per-CPU data */ > struct sev_es_runtime_data { > struct ghcb ghcb_page; > @@ -2176,3 +2184,138 @@ static int __init add_snp_guest_request(void) > return 0; > } > device_initcall(add_snp_guest_request); > + > +#undef pr_fmt > +#define pr_fmt(fmt) "SEV-SNP: " fmt > + > +static int __snp_enable(unsigned int cpu) > +{ > + u64 val; > + > + if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP)) > + return 0; > + > + rdmsrl(MSR_AMD64_SYSCFG, val); > + > + val |=3D MSR_AMD64_SYSCFG_SNP_EN; > + val |=3D MSR_AMD64_SYSCFG_SNP_VMPL_EN; Is VMPL required? Do we plan on using VMPL out of the gate? > + > + wrmsrl(MSR_AMD64_SYSCFG, val); > + > + return 0; > +} > + > +static __init void snp_enable(void *arg) > +{ > + __snp_enable(smp_processor_id()); > +} > + > +static bool get_rmptable_info(u64 *start, u64 *len) > +{ > + u64 calc_rmp_sz, rmp_sz, rmp_base, rmp_end, nr_pages; > + > + rdmsrl(MSR_AMD64_RMP_BASE, rmp_base); > + rdmsrl(MSR_AMD64_RMP_END, rmp_end); > + > + if (!rmp_base || !rmp_end) { Can BIOS put the RMP at PA=3D0? Also, why is it a BIOS decision? AFAICT, the MSRs aren't locked until SN= P_EN is set in SYSCFG, and that appears to be a kernel decision (ignoring kexe= c), i.e. nothing would prevent the kernel from configuring it's own RMP. > + pr_info("Memory for the RMP table has not been reserved by BIOS\n"); > + return false; > + } > + > + rmp_sz =3D rmp_end - rmp_base + 1; > + > + /* > + * Calculate the amount the memory that must be reserved by the BIOS = to > + * address the full system RAM. The reserved memory should also cover= the > + * RMP table itself. > + * > + * See PPR section 2.1.5.2 for more information on memory requirement= . > + */ > + nr_pages =3D totalram_pages(); > + calc_rmp_sz =3D (((rmp_sz >> PAGE_SHIFT) + nr_pages) << 4) + RMPTABLE= _ENTRIES_OFFSET; > + > + if (calc_rmp_sz > rmp_sz) { > + pr_info("Memory reserved for the RMP table does not cover the full s= ystem " > + "RAM (expected 0x%llx got 0x%llx)\n", calc_rmp_sz, rmp_sz); Is BIOS expected to provide exact coverage, e.g. should this be s/expecte= d/need? Should the kernel also sanity check other requirements, e.g. the 8kb alig= nment, or does the CPU enforce those things at WRMSR? > + return false; > + } > + > + *start =3D rmp_base; > + *len =3D rmp_sz; > + > + pr_info("RMP table physical address 0x%016llx - 0x%016llx\n", rmp_bas= e, rmp_end); > + > + return true; > +} > + > +static __init int __snp_rmptable_init(void) > +{ > + u64 rmp_base, sz; > + void *start; > + u64 val; > + > + if (!get_rmptable_info(&rmp_base, &sz)) > + return 1; > + > + start =3D memremap(rmp_base, sz, MEMREMAP_WB); > + if (!start) { > + pr_err("Failed to map RMP table 0x%llx+0x%llx\n", rmp_base, sz); > + return 1; > + } > + > + /* > + * Check if SEV-SNP is already enabled, this can happen if we are com= ing from > + * kexec boot. > + */ > + rdmsrl(MSR_AMD64_SYSCFG, val); > + if (val & MSR_AMD64_SYSCFG_SNP_EN) Hmm, it kinda feels like there should be a sanity check for the case wher= e SNP is already enabled but get_rmptable_info() fails, e.g. due to insufficient R= MP size. > + goto skip_enable; > + > + /* Initialize the RMP table to zero */ > + memset(start, 0, sz); > + > + /* Flush the caches to ensure that data is written before SNP is enab= led. */ > + wbinvd_on_all_cpus(); > + > + /* Enable SNP on all CPUs. */ > + on_each_cpu(snp_enable, NULL, 1); > + > +skip_enable: > + rmptable_start =3D (unsigned long)start; Mostly out of curiosity, why store start/end as unsigned longs? This is = all 64-bit only so it doesn't actually affect the code generation, but it feels odd = to store things that absolutely have to be 64-bit values as unsigned long. Similar question for why asm/sev-common.h cases to unsigned long instead = of u64. E.g. the below in particular looks wrong because we're shifting an unsign= ed long b y32 bits, i.e. the value _must_ be a 64-bit value, why obfuscate that? #define GHCB_CPUID_REQ(fn, reg) \ (GHCB_MSR_CPUID_REQ | \ (((unsigned long)reg & GHCB_MSR_CPUID_REG_MASK) << GHCB_MSR_CPUID_REG_P= OS) | \ (((unsigned long)fn) << GHCB_MSR_CPUID_FUNC_POS)) > + rmptable_end =3D rmptable_start + sz; > + > + return 0; > +} > + > +static int __init snp_rmptable_init(void) > +{ > + if (!boot_cpu_has(X86_FEATURE_SEV_SNP)) > + return 0; > + > + /* > + * The SEV-SNP support requires that IOMMU must be enabled, and is no= t > + * configured in the passthrough mode. > + */ > + if (no_iommu || iommu_default_passthrough()) { Similar comment regarding the sanity check, kexec'ing into a kernel with = SNP already enabled should probably fail explicitly if the new kernel is boot= ed with incompatible params. > + setup_clear_cpu_cap(X86_FEATURE_SEV_SNP); > + pr_err("IOMMU is either disabled or configured in passthrough mode.\= n"); > + return 0; > + } > + > + if (__snp_rmptable_init()) { > + setup_clear_cpu_cap(X86_FEATURE_SEV_SNP); > + return 1; > + } > + > + cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/rmptable_init:online", __= snp_enable, NULL); > + > + return 0; > +} > + > +/* > + * This must be called after the PCI subsystem. This is because before= enabling > + * the SNP feature we need to ensure that IOMMU is not configured in t= he > + * passthrough mode. The iommu_default_passthrough() is used for check= ing the > + * passthough state, and it is available after subsys_initcall(). > + */ > +fs_initcall(snp_rmptable_init); > --=20 > 2.17.1 >=20