From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marc Orr
Date: Fri, 1 Apr 2022 10:28:58 -0700
Subject: Re: [PATCH RFC v1 0/9] KVM: SVM: Defer page pinning for SEV guests
To: Sean Christopherson
Cc: Peter Gonda, "Nikunj A. Dadhania", Paolo Bonzini, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Brijesh Singh, Tom Lendacky,
	Bharata B Rao, "Maciej S. Szmigiero", Mingwei Zhang,
	David Hildenbrand, kvm list, LKML
References: <20220308043857.13652-1-nikunj@amd.com>
	<5567f4ec-bbcf-4caf-16c1-3621b77a1779@amd.com>
Content-Type: text/plain; charset="UTF-8"

On Thu, Mar 31, 2022 at 12:01 PM Sean Christopherson wrote:
>
> On Thu, Mar 31, 2022, Peter Gonda wrote:
> > On Wed, Mar 30, 2022 at 10:48 PM Nikunj A. Dadhania wrote:
> > > On 3/31/2022 1:17 AM, Sean Christopherson wrote:
> > > > On Wed, Mar 30, 2022, Nikunj A. Dadhania wrote:
> > > >> On 3/29/2022 2:30 AM, Sean Christopherson wrote:
> > > >>> Let me preface this by saying I generally like the idea and especially
> > > >>> the performance, but...
> > > >>>
> > > >>> I think we should abandon this approach in favor of committing all our
> > > >>> resources to fd-based private memory[*], which (if done right) will
> > > >>> provide on-demand pinning for "free".
> > > >>
> > > >> I will give this a try for SEV; it was on my todo list.
> > > >>
> > > >>> I would much rather get that support merged sooner than later, and use
> > > >>> it as a carrot for legacy SEV to get users to move over to its new
> > > >>> APIs, with a long term goal of deprecating and disallowing SEV/SEV-ES
> > > >>> guests without fd-based private memory.
> > > >>>
> > > >>> That would require guest kernel support to communicate private vs.
> > > >>> shared,
> > > >>
> > > >> Could you explain this in more detail? Is this required for punching
> > > >> holes for shared pages?
> > > >
> > > > Unlike SEV-SNP, which enumerates private vs. shared in the error code,
> > > > SEV and SEV-ES don't provide private vs. shared information to the host
> > > > (KVM) on page fault. And it's even more fundamental than that:
> > > > SEV/SEV-ES guests won't even fault if they access the "wrong" GPA
> > > > variant, they'll silently consume/corrupt data.
> > > >
> > > > That means KVM can't support implicit conversions for SEV/SEV-ES, and
> > > > so an explicit hypercall is mandatory. SEV doesn't even have a
> > > > vendor-agnostic guest/host paravirt ABI, and IIRC SEV-ES doesn't
> > > > provide a conversion/map hypercall in the GHCB spec, so running a
> > > > SEV/SEV-ES guest under UPM would require the guest firmware+kernel to
> > > > be properly enlightened beyond what is required architecturally.
> > >
> > > So with the guest supporting KVM_FEATURE_HC_MAP_GPA_RANGE and the host
> > > (KVM) supporting the KVM_HC_MAP_GPA_RANGE hypercall, a SEV/SEV-ES guest
> > > can communicate private/shared pages to the hypervisor, and this
> > > information can be used to mark pages shared/private.
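
(For readers skimming the thread: the guest-side notification Nikunj is
describing looks roughly like the sketch below. The hypercall number and flag
encodings follow include/uapi/linux/kvm_para.h, but the helper name is made
up, so treat this as illustrative rather than the actual patch.)

#include <linux/kvm_para.h>	/* KVM_HC_MAP_GPA_RANGE + flag encodings */

/*
 * Sketch of a guest-side helper that tells the host a GPA range has
 * flipped between private (encrypted) and shared (decrypted).
 */
static void notify_page_enc_status(unsigned long gpa, int npages, bool enc)
{
	kvm_hypercall3(KVM_HC_MAP_GPA_RANGE, gpa, npages,
		       KVM_MAP_GPA_RANGE_ENC_STAT(enc) |
		       KVM_MAP_GPA_RANGE_PAGE_SZ_4K);
}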
> >
> > One concern here may be that the VMM doesn't know which guests have
> > KVM_FEATURE_HC_MAP_GPA_RANGE support and which don't. Only once the guest
> > boots does the guest tell KVM that it supports
> > KVM_FEATURE_HC_MAP_GPA_RANGE. If the guest doesn't, we need to pin all
> > the memory before we run the guest, to be safe.
>
> Yep, that's a big reason why I view purging the existing SEV memory
> management as a long term goal. The other being that userspace obviously
> needs to be updated to support UPM[*]. I suspect the only feasible way to
> enable this for SEV/SEV-ES would be to restrict it to new VM types that
> have a disclaimer regarding additional requirements.
>
> [*] I believe Peter coined the UPM acronym for "Unmapping guest Private
> Memory". We've been using it internally for discussion; it rolls off the
> tongue a lot easier than the full phrase, and is much more
> precise/descriptive than just "private fd".
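
(On Peter's point above: the guest discovers the feature via the KVM CPUID
leaf only once it boots, so the VMM has no signal ahead of time. A minimal
guest-side sketch, assuming the standard kvm_para helpers -- the function
name here is invented:)

#include <linux/kvm_para.h>

/*
 * Returns true if this guest can issue KVM_HC_MAP_GPA_RANGE. This runs
 * in the guest at boot; nothing tells the VMM beforehand whether a
 * given image will ever issue the hypercall.
 */
static bool guest_can_report_enc_status(void)
{
	return kvm_para_has_feature(KVM_FEATURE_HC_MAP_GPA_RANGE);
}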
Can we really "purge the existing SEV memory management"? This seems like a
non-starter because it violates userspace API (i.e., the ability for the
userspace VMM to run a guest without KVM_FEATURE_HC_MAP_GPA_RANGE). Or maybe
I'm not quite following what you mean by purge.

Assuming that UPM-based lazy pinning comes together via a new VM type that
only supports new images based on a minimum kernel version with
KVM_FEATURE_HC_MAP_GPA_RANGE, then I think this would look as follows:

1. Userspace VMM: Check the SEV VM type. If the type is a legacy SEV type,
do upfront pinning. Else, skip upfront pinning. (Sketched at the end of
this mail.)

2. KVM: I'm not sure anything special needs to happen here. For the legacy
VM types, it can be configured to use legacy memslots, presumably the same
as non-confidential VMs will be configured. For the new VM type, it should
be configured to use UPM.

3. Control plane (the thing creating VMs): Responsible for not allowing
legacy SEV images (i.e., images without KVM_FEATURE_HC_MAP_GPA_RANGE) with
the new SEV VM types that use UPM and have support for demand pinning.

Sean: Did I get this right?
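
P.S. To make step 1 concrete, the decision I have in mind on the VMM side is
shaped roughly like this -- every name below is a hypothetical stand-in for
however the VMM actually models VM types and memory setup:

#include <stddef.h>

/* Hypothetical helpers standing in for the VMM's real memory setup. */
extern int pin_all_guest_memory(void *hva, size_t size);
extern int register_private_fd_memory(void *hva, size_t size);

/* Hypothetical VM-type tag distinguishing legacy SEV from the new type. */
enum sev_vm_type { SEV_VM_LEGACY, SEV_VM_UPM };

static int setup_guest_memory(enum sev_vm_type type, void *hva, size_t size)
{
	if (type == SEV_VM_LEGACY)
		/* No runtime private/shared signal: pin everything up front. */
		return pin_all_guest_memory(hva, size);

	/* New VM type: private-fd (UPM) backing, pinned on demand. */
	return register_private_fd_memory(hva, size);
}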