From: "Andy Lutomirski"
To: "David Hildenbrand", "Sean Christopherson"
Cc: "Paolo Bonzini", "Vitaly Kuznetsov", "Wanpeng Li", "Jim Mattson",
    "Joerg Roedel", "kvm list", "Linux Kernel Mailing List",
    "Borislav Petkov", "Andrew Morton", "Joerg Roedel", "Andi Kleen",
    "David Rientjes", "Vlastimil Babka", "Tom Lendacky", "Thomas Gleixner",
    "Peter Zijlstra (Intel)", "Ingo Molnar", "Varad Gautam", "Dario Faggioli",
    "the arch/x86 maintainers", linux-mm@kvack.org, linux-coco@lists.linux.dev,
    "Kirill A. Shutemov", "Kirill A . Shutemov", "Sathyanarayanan Kuppuswamy",
    "Dave Hansen", "Yu Zhang"
Subject: Re: [RFC] KVM: mm: fd-based approach for supporting KVM guest private memory
Date: Tue, 31 Aug 2021 21:58:50 -0700
In-Reply-To: <61ea53ce-2ba7-70cc-950d-ca128bcb29c5@redhat.com>
References: <20210824005248.200037-1-seanjc@google.com>
    <307d385a-a263-276f-28eb-4bc8dd287e32@redhat.com>
    <61ea53ce-2ba7-70cc-950d-ca128bcb29c5@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Aug 31, 2021, at 12:07 PM, David Hildenbrand wrote:
> On 28.08.21 00:18, Sean Christopherson wrote:
> > On Thu, Aug 26, 2021, David Hildenbrand wrote:
> >> You'll end up with a VMA that corresponds to the whole file in a single
> >> process only, and that cannot vanish, not even in parts.
> >
> > How would userspace tell the kernel to free parts of memory that it doesn't want
> > assigned to the guest, e.g. to free memory that the guest has converted to
> > not-private?
>
> I'd guess one possibility could be fallocate(FALLOC_FL_PUNCH_HOLE).
>
> Questions are: when would it actually be allowed to perform such a
> destructive operation? Do we have to protect from that? How would KVM
> protect from user space replacing private pages by shared pages in any
> of the models we discuss?
>

What do you mean? If userspace maliciously replaces a shared page by a private page, then the guest crashes.

(The actual meaning here is a bit different on SNP-ES vs TDX.

In SNP-ES, a given GPA can be shared, private, or nonexistent. A guest accesses it with a special bit set in the guest page tables to indicate whether it expects shared or private, and the CPU will produce an appropriate error if the bit doesn't match the page.

In TDX, there is actually an entirely separate shared vs private address space, and, in theory, a given "GPA" can exist as shared and as private at once. The full guest N-bit GPA plus the shared/private bit is logically an N+1 bit address, and it's possible to map all of it at once, half shared, and half private. In practice, the defined guest->host APIs don't really support that usage.
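To make the SNP-ES vs TDX distinction above concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not how either architecture or KVM implements anything: the class names, the 8-bit GPA width, the fault strings, and the choice that a set top bit means "shared" are all made up for illustration.

```python
from enum import Enum

GPA_BITS = 8  # toy value of N (assumption; real GPA widths are platform-specific)


class Kind(Enum):
    PRIVATE = 0
    SHARED = 1  # assumption: a set bit means "shared" in this model


class SnpStyleMemory:
    """Hypothetical model: each GPA is shared, private, or nonexistent."""

    def __init__(self):
        self.state = {}  # gpa -> Kind; a missing key means nonexistent

    def assign(self, gpa, kind):
        # Assigning again replaces the old state: one state per GPA.
        self.state[gpa] = kind

    def access(self, gpa, expected):
        actual = self.state.get(gpa)
        if actual is None:
            return "fault: nonexistent"
        if actual is not expected:
            # The guest's expectation bit does not match the page.
            return "fault: mismatch"
        return "ok"


class TdxStyleMemory:
    """Hypothetical model: shared and private are separate address spaces."""

    def __init__(self):
        self.mapped = set()  # set of full (N+1)-bit addresses

    @staticmethod
    def full_address(gpa, kind):
        if not 0 <= gpa < (1 << GPA_BITS):
            raise ValueError("GPA out of range")
        # The shared/private bit sits above the N GPA bits.
        return (kind.value << GPA_BITS) | gpa

    def map(self, gpa, kind):
        self.mapped.add(self.full_address(gpa, kind))

    def access(self, gpa, kind):
        if self.full_address(gpa, kind) in self.mapped:
            return "ok"
        return "fault: unmapped"


snp = SnpStyleMemory()
snp.assign(0x10, Kind.PRIVATE)
snp_mismatch = snp.access(0x10, Kind.SHARED)   # guest expects shared, page is private

tdx = TdxStyleMemory()
tdx.map(0x10, Kind.PRIVATE)
tdx.map(0x10, Kind.SHARED)                     # same "GPA", both halves mapped
tdx_both = (tdx.access(0x10, Kind.PRIVATE), tdx.access(0x10, Kind.SHARED))
```

In the first model a GPA has exactly one state, so a mismatched expectation faults. In the second, the same "GPA" resolves to two distinct (N+1)-bit addresses, so both can be mapped at once.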