Date: Fri, 12 Nov 2021 19:48:17 +0000
From: Sean Christopherson
To: Borislav Petkov
Cc: Dave Hansen, Peter Gonda, Brijesh Singh, x86@kernel.org,
 linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
 linux-coco@lists.linux.dev, linux-mm@kvack.org,
 linux-crypto@vger.kernel.org, Thomas Gleixner, Ingo Molnar,
 Joerg Roedel, Tom Lendacky, "H. Peter Anvin", Ard Biesheuvel,
 Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
 Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Zijlstra,
 Srinivas Pandruvada, David Rientjes, Dov Murik,
 Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
 "Kirill A . Shutemov", Andi Kleen, tony.luck@intel.com,
 marcorr@google.com, sathyanarayanan.kuppuswamy@linux.intel.com
Subject: Re: [PATCH Part2 v5 00/45] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support
References: <20210820155918.7518-1-brijesh.singh@amd.com>
 <061ccd49-3b9f-d603-bafd-61a067c3f6fa@intel.com>

On Fri, Nov 12, 2021, Borislav Petkov wrote:
> On Fri, Nov 12, 2021 at 09:59:46AM -0800, Dave Hansen wrote:
> > Or, is there some mechanism that prevent guest-private memory from being
> > accessed in random host kernel code?  Or random host userspace code...
>
> So I'm currently under the impression that random host->guest accesses
> should not happen if not previously agreed upon by both.

Key word "should".

> Because, as explained on IRC, if host touches a private guest page,
> whatever the host does to that page, the next time the guest runs, it'll
> get a #VC where it will see that that page doesn't belong to it anymore
> and then, out of paranoia, it will simply terminate to protect itself.
>
> So cloud providers should have an interest to prevent such random stray
> accesses if they wanna have guests. :)

Yes, but IMO inducing a fault in the guest because of a _host_ bug is wrong.

On Fri, Nov 12, 2021, Peter Gonda wrote:
> Here is an alternative to the current approach: On RMP violation (host
> or userspace) the page fault handler converts the page from private to
> shared to allow the write to continue. This pulls from s390’s error
> handling which does exactly this. See ‘arch_make_page_accessible()’.

Ah, after further reading, s390 does _not_ do implicit private=>shared
conversions.

s390's arch_make_page_accessible() is somewhat similar, but it is not a
direct comparison.  IIUC, it exports and integrity protects the data and
thus preserves the guest's data in an encrypted form, e.g. so that it can
be swapped to disk.  And if the host corrupts the data, attempting to
convert it back to secure on a subsequent guest access will fail.

The host kernel's handling of the "convert to secure" failures doesn't
appear to be all that robust, e.g. it looks like there are multiple paths
where the error is dropped on the floor and the guest is resumed, but IMO
soft hanging the guest is still better than inducing a fault in the guest,
and far better than potentially coercing the guest into reading corrupted
memory ("spurious" PVALIDATE).  And s390's behavior is fixable since it's
purely a host error handling problem.

To truly make a page shared, s390 requires the guest to call into the
ultravisor to make a page shared.  And on the host side, the host can pin
a page as shared to prevent the guest from unsharing it while the host is
accessing it as a shared page.

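For anyone who wants the export/import behavior described above in concrete form, here is a rough toy model in Python.  Everything in it (ToyUltravisor, export_page(), import_page(), SecureImportError) is a made-up name for illustration and is not the s390 ultravisor interface; the XOR keystream and the reuse of one key for both encryption and the MAC are deliberate simplifications, not real cryptography.  The only point it tries to capture is that the host sees an opaque blob, and that tampering is detected when the page is converted back to secure.

```python
import hashlib
import hmac
import os


class SecureImportError(Exception):
    """Hypothetical error: converting the page back to secure failed."""


def _keystream(key: bytes, length: int) -> bytes:
    # Toy keystream built from SHA-256 in counter mode; illustration only.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


class ToyUltravisor:
    """Toy stand-in for the trusted entity; the host never sees self._key."""

    def __init__(self) -> None:
        self._key = os.urandom(32)

    def export_page(self, plaintext: bytes):
        # Host wants to touch the page (e.g. to swap it out): hand over an
        # encrypted blob plus an integrity tag instead of the guest's data.
        stream = _keystream(self._key, len(plaintext))
        blob = bytes(p ^ s for p, s in zip(plaintext, stream))
        tag = hmac.new(self._key, blob, hashlib.sha256).digest()
        return blob, tag

    def import_page(self, blob: bytes, tag: bytes) -> bytes:
        # Guest touches the page again: verify integrity before decrypting.
        expected = hmac.new(self._key, blob, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, tag):
            raise SecureImportError("exported page was modified by the host")
        stream = _keystream(self._key, len(blob))
        return bytes(c ^ s for c, s in zip(blob, stream))
```

In this model a buggy host that scribbles on the exported blob makes import_page() raise, i.e. the failure surfaces on the host side at conversion time rather than as corrupted data inside the guest.
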
So, inducing #VC is similar in the sense that a malicious s390 can also
DoS itself, but is quite different in that (AFAICT) s390 does not create
an attack surface where a malicious or buggy host userspace can induce
faults in the guest, or worst case in SNP, exploit a buggy guest into
accepting and accessing corrupted data.

It's also different in that s390 doesn't implicitly convert between shared
and private.  Functionally, it doesn't really change the end result because
a buggy host that writes guest private memory will DoS the guest (by
inducing a #VC or corrupting exported data), but at least for s390 there's
a sane, legitimate use case for accessing guest private memory (swap and
maybe migration?), whereas for SNP, IMO implicitly converting to shared on
a host access is straight up wrong.

> Additionally it adds less complexity to the SNP kernel patches, and
> requires no new ABI.

I disagree, this would require "new" ABI in the sense that it commits KVM
to supporting SNP without requiring userspace to initiate any and all
conversions between shared and private.  Which in my mind is the big
elephant in the room: do we want to require new KVM (and kernel?) ABI to
allow/force userspace to explicitly declare guest private memory for TDX
_and_ SNP, or just TDX?
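
To make the two options in that question a bit more concrete, below is a toy Python model.  It is only a sketch, not KVM code: ToyVm, userspace_convert(), host_write(), guest_access(), HostAccessError and GuestFault are all hypothetical names.  It also assumes, as a simplification that is not spelled out in this thread, that under an explicit-only policy a stray host write to private memory simply fails on the host side.

```python
from enum import Enum


class PageState(Enum):
    PRIVATE = "private"
    SHARED = "shared"


class HostAccessError(Exception):
    """Hypothetical host-side failure for touching guest-private memory."""


class GuestFault(Exception):
    """Stands in for the #VC a guest takes on a page it no longer owns."""


class ToyVm:
    def __init__(self, implicit_conversion: bool) -> None:
        # True models "convert to shared on host access"; False models
        # "userspace must initiate every conversion".
        self.implicit_conversion = implicit_conversion
        self.pages = {}  # guest frame number -> PageState

    def userspace_convert(self, gfn: int, state: PageState) -> None:
        # Explicit conversion requested by userspace.
        self.pages[gfn] = state

    def host_write(self, gfn: int) -> None:
        if self.pages[gfn] is PageState.PRIVATE:
            if not self.implicit_conversion:
                # Assumption: the host bug is reported to the host.
                raise HostAccessError(f"host wrote private gfn {gfn}")
            # Implicit policy: silently flip the page so the write continues.
            self.pages[gfn] = PageState.SHARED

    def guest_access(self, gfn: int) -> None:
        # The guest still believes this page is private.
        if self.pages[gfn] is not PageState.PRIVATE:
            raise GuestFault(f"gfn {gfn} is no longer private")
```

With implicit_conversion=True, a host_write() on a private page succeeds and the next guest_access() raises GuestFault, so the guest pays for the host's bug.  With implicit_conversion=False, the same host_write() raises HostAccessError and the guest's page is left untouched.
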