Date: Wed, 3 Aug 2022 15:51:24 +0000
From: Sean Christopherson
To: Chao Peng
Cc: Wei Wang, "Gupta, Pankaj", kvm@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
 linux-doc@vger.kernel.org, qemu-devel@nongnu.org,
 linux-kselftest@vger.kernel.org, Paolo Bonzini, Jonathan Corbet,
 Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
 Ingo Molnar, Borislav Petkov, x86@kernel.org, "H . Peter Anvin",
 Hugh Dickins, Jeff Layton, "J . Bruce Fields", Andrew Morton, Shuah Khan,
 Mike Rapoport, Steven Price, "Maciej S . Szmigiero", Vlastimil Babka,
 Vishal Annapurve, Yu Zhang, "Kirill A . Shutemov", luto@kernel.org,
 jun.nakajima@intel.com, dave.hansen@intel.com, ak@linux.intel.com,
 david@redhat.com, aarcange@redhat.com, ddutile@redhat.com,
 dhildenb@redhat.com, Quentin Perret, Michael Roth, mhocko@suse.com,
 Muchun Song
Subject: Re: [PATCH v7 11/14] KVM: Register/unregister the guest private memory regions
References: <45ae9f57-d595-f202-abb5-26a03a2ca131@linux.intel.com>
 <20220721092906.GA153288@chaop.bj.intel.com>
 <20220725130417.GA304216@chaop.bj.intel.com>
 <20220803094827.GA607465@chaop.bj.intel.com>
In-Reply-To: <20220803094827.GA607465@chaop.bj.intel.com>

On Wed, Aug 03, 2022, Chao Peng wrote:
> On Tue, Aug 02, 2022 at 04:38:55PM +0000, Sean Christopherson wrote:
> > On Tue, Aug 02, 2022, Sean Christopherson wrote:
> > > I think we should avoid UNMAPPABLE even on the KVM side of things for the core
> > > memslots functionality and instead be very literal, e.g.
> > >
> > >   KVM_HAS_FD_BASED_MEMSLOTS
> > >   KVM_MEM_FD_VALID
> > >
> > > We'll still need KVM_HAS_USER_UNMAPPABLE_MEMORY, but it won't be tied directly to
> > > the memslot. Decoupling the two things will require a bit of extra work, but the
> > > code impact should be quite small, e.g. explicitly query and propagate
> > > MEMFILE_F_USER_INACCESSIBLE to kvm_memory_slot to track if a memslot can be private.
> > > And unless I'm missing something, it won't require an additional memslot flag.
> > > The biggest oddity (if we don't also add KVM_MEM_PRIVATE) is that KVM would
> > > effectively ignore the hva for fd-based memslots for VM types that don't support
> > > private memory, i.e. userspace can't opt out of using the fd-based backing, but that
> > > doesn't seem like a deal breaker.
>
> I actually love this idea. I don't mind adding extra code for potential
> usage other than confidential VMs if we can have a workable solution for
> it.
>
> >
> > Hrm, but basing private memory on top of a generic FD_VALID would effectively require
> > shared memory to use hva-based memslots for confidential VMs. That'd yield a very
> > weird API, e.g.
> > non-confidential VMs could be backed entirely by fd-based memslots,
> > but confidential VMs would be forced to use hva-based memslots.
>
> It would work if we can treat userspace_addr as optional for
> KVM_MEM_FD_VALID, e.g. userspace can opt in to decide whether needing
> the mappable part or not for a regular VM and we can enforce KVM for
> confidential VMs. But the u64 type of userspace_addr doesn't allow us to
> express a 'null' value so sounds like we will end up needing another
> flag anyway.
>
> In concept, we could have three configurations here:
>   1. hva-only: without any flag and use userspace_addr;
>   2. fd-only: another new flag is needed and use fd/offset;
>   3. hva/fd mixed: both userspace_addr and fd/offset is effective.
>      KVM_MEM_PRIVATE is a subset of it for confidential VMs. Not sure
>      regular VM also wants this.

My mental model breaks things down slightly differently, though the end result is
more or less the same.

After this series, there will be two types of memory: private and "regular" (I'm
trying to avoid "shared"). "Regular" memory is always hva-based (userspace_addr),
and private memory is always fd-based (fd+offset).

In the future, if we want to support fd-based memory for "regular" memory, then
as you said we'd need to add a new flag, and a new fd+offset pair.

At that point, we'd have two new (relative to the current set) flags:

  KVM_MEM_PRIVATE_FD_VALID
  KVM_MEM_FD_VALID

along with two new pairs of fd+offset (private_* and "regular"). Mapping those
to your above list:

  1.  Neither *_FD_VALID flag is set.
  2a. Both PRIVATE_FD_VALID and FD_VALID are set.
  2b. FD_VALID is set and the VM doesn't support private memory.
  3.  Only PRIVATE_FD_VALID is set (with private memory support in the VM).

Thus, "regular" VMs can't have a mix in a single memslot because they can't use
private memory.

> There is no direct relationship between unmappable and fd-based since
> even fd-based can also be mappable for regular VM?

Yep.

> > Ignore this idea for now.
> > If there's an actual use case for generic fd-based memory
> > then we'll want a separate flag, fd, and offset, i.e. that support could be added
> > independent of KVM_MEM_PRIVATE.
>
> If we ignore this idea now (which I'm also fine), do you still think we
> need change KVM_MEM_PRIVATE to KVM_MEM_USER_UNMAPPABLE?

Hmm, no. After working through this, I think it's safe to say KVM_MEM_USER_UNMAPPABLE
is a bad name because we could end up with "regular" memory that's backed by an
inaccessible (unmappable) file.

One alternative would be to call it KVM_MEM_PROTECTED. That shouldn't cause
problems for the known use of "private" (TDX and SNP), and it gives us a little
wiggle room, e.g. if we ever get a use case where VMs can share memory that is
otherwise protected.

That's a pretty big "if" though, and odds are good we'd need more memslot flags and
fd+offset pairs to allow differentiating "private" vs. "protected-shared" without
forcing userspace to punch holes in memslots, so I don't know that hedging now will
buy us anything.

So I'd say that if people think KVM_MEM_PRIVATE brings additional and meaningful
clarity over KVM_MEM_PROTECTED, then let's go with PRIVATE. But if PROTECTED is
just as good, go with PROTECTED as it gives us a wee bit of wiggle room for the
future.

Note, regardless of what name we settle on, I think it makes sense to do the
KVM_PRIVATE_MEM_SLOTS => KVM_INTERNAL_MEM_SLOTS rename.
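
For illustration only, the 1/2a/2b/3 mapping above can be written as a small
userspace C sketch. Everything in it is an assumption made for the example: the
SKETCH_* bit values, the slot_kind enum and classify_memslot() are hypothetical
names, not KVM uAPI and not code from this series. The combination "FD_VALID
alone on a VM that supports private memory" is not covered by the list, so the
sketch reports it separately instead of guessing.

```c
#include <assert.h>   /* static_assert */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this sketch; not real uAPI. */
#define SKETCH_MEM_PRIVATE_FD_VALID (1u << 0)
#define SKETCH_MEM_FD_VALID         (1u << 1)

static_assert((SKETCH_MEM_PRIVATE_FD_VALID & SKETCH_MEM_FD_VALID) == 0,
              "the two flags must occupy distinct bits");

/* Hypothetical labels for the configurations discussed in the thread. */
enum slot_kind {
    SLOT_HVA_ONLY,  /* 1:  neither *_FD_VALID flag set */
    SLOT_FD_ONLY,   /* 2a: both flags set; 2b: FD_VALID on a non-private VM */
    SLOT_MIXED,     /* 3:  only PRIVATE_FD_VALID set, hva for "regular" memory */
    SLOT_INVALID,   /* private fd requested on a VM without private memory */
    SLOT_UNLISTED   /* FD_VALID alone on a private-capable VM: not in the list */
};

/* Classify a memslot from its flags and whether the VM supports private memory. */
enum slot_kind classify_memslot(uint32_t flags, bool vm_has_private)
{
    bool priv_fd = (flags & SKETCH_MEM_PRIVATE_FD_VALID) != 0;
    bool reg_fd  = (flags & SKETCH_MEM_FD_VALID) != 0;

    /* "Regular" VMs can't use private memory at all. */
    if (priv_fd && !vm_has_private)
        return SLOT_INVALID;

    if (!priv_fd && !reg_fd)
        return SLOT_HVA_ONLY;                       /* case 1 */

    if (priv_fd && reg_fd)
        return SLOT_FD_ONLY;                        /* case 2a */

    if (reg_fd)                                     /* FD_VALID only */
        return vm_has_private ? SLOT_UNLISTED       /* not covered above */
                              : SLOT_FD_ONLY;       /* case 2b */

    return SLOT_MIXED;                              /* case 3 */
}
```

With these assumptions, classify_memslot(SKETCH_MEM_PRIVATE_FD_VALID, true)
yields SLOT_MIXED, and the same flag on a VM without private memory yields
SLOT_INVALID, which matches the point that "regular" VMs can't have a mix in a
single memslot.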