From: David Hildenbrand
Organization: Red Hat
To: Chao Peng, "Kirill A. Shutemov"
Cc: Andy Lutomirski, Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov,
 Wanpeng Li, Jim Mattson, Joerg Roedel, kvm@vger.kernel.org,
 linux-kernel@vger.kernel.org, Borislav Petkov, Andrew Morton, Joerg Roedel,
 Andi Kleen, David Rientjes, Vlastimil Babka, Tom Lendacky, Thomas Gleixner,
 Peter Zijlstra, Ingo Molnar, Varad Gautam, Dario Faggioli, x86@kernel.org,
 linux-mm@kvack.org, linux-coco@lists.linux.dev, "Kirill A . Shutemov",
 Kuppuswamy Sathyanarayanan, Dave Hansen, Yu Zhang
Subject: Re: [RFC] KVM: mm: fd-based approach for supporting KVM guest private memory
Date: Wed, 15 Sep 2021 15:51:25 +0200
Message-ID: <51a6f74f-6c05-74b9-3fd7-b7cd900fb8cc@redhat.com>
In-Reply-To: <20210915195857.GA52522@chaop.bj.intel.com>
References: <20210824005248.200037-1-seanjc@google.com>
 <20210902184711.7v65p5lwhpr2pvk7@box.shutemov.name>
 <20210903191414.g7tfzsbzc7tpkx37@box.shutemov.name>
 <02806f62-8820-d5f9-779c-15c0e9cd0e85@kernel.org>
 <20210910171811.xl3lms6xoj3kx223@box.shutemov.name>
 <20210915195857.GA52522@chaop.bj.intel.com>

>> diff --git a/mm/memfd.c b/mm/memfd.c
>> index 081dd33e6a61..ae43454789f4 100644
>> --- a/mm/memfd.c
>> +++ b/mm/memfd.c
>> @@ -130,11 +130,24 @@ static unsigned int *memfd_file_seals_ptr(struct file *file)
>>  	return NULL;
>>  }
>>  
>> +int memfd_register_guest(struct inode *inode, void *owner,
>> +			 const struct guest_ops *guest_ops,
>> +			 const struct guest_mem_ops **guest_mem_ops)
>> +{
>> +	if (shmem_mapping(inode->i_mapping)) {
>> +		return shmem_register_guest(inode, owner,
>> +					    guest_ops, guest_mem_ops);
>> +	}
>> +
>> +	return -EINVAL;
>> +}
>
> Are we stick our design to memfd interface (e.g other memory backing
> stores like tmpfs and hugetlbfs will all rely on this memfd interface to
> interact with KVM), or this is just the initial implementation for PoC?

I don't think we are; it still feels like we are in the early prototype
phase (even way before a PoC). I'd be happy to see something "cleaner"
so to say -- it still feels kind of hacky to me, especially since there
seem to be many pieces of the big puzzle missing so far. Unfortunately,
this series hasn't caught the attention of many -MM people so far, maybe
because other people miss the big picture as well and are waiting for a
complete design proposal.

For example, what's unclear to me: we'll be allocating pages with
GFP_HIGHUSER_MOVABLE, making them land on MIGRATE_CMA or ZONE_MOVABLE;
then we silently turn them unmovable, which breaks these concepts. Who'd
migrate these pages away just like when doing long-term pinning, or how
is that supposed to work?

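As an illustration of the movability concern above, here is a toy
userspace model. It is not kernel code and not part of the patch under
discussion: `toy_page`, `toy_alloc_movable()`, `toy_make_guest_private()`
and `toy_area_can_be_vacated()` are hypothetical names made up for this
sketch. The only assumption it encodes is the general rule that a
movable area (standing in for ZONE_MOVABLE or MIGRATE_CMA) can be
vacated only while every page placed in it can still be migrated.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and helpers are hypothetical, not kernel APIs. */
struct toy_page {
	bool in_movable_area;	/* stands in for ZONE_MOVABLE / MIGRATE_CMA placement */
	bool migratable;	/* false once the page has been made unmovable */
};

/* Models an allocation that promises the page will stay movable. */
static struct toy_page toy_alloc_movable(void)
{
	struct toy_page page = { .in_movable_area = true, .migratable = true };

	return page;
}

/* Models turning a page into guest private memory without moving it first. */
static void toy_make_guest_private(struct toy_page *page)
{
	page->migratable = false;
}

/* A movable area can only be vacated if every page in it can be migrated. */
static bool toy_area_can_be_vacated(const struct toy_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (pages[i].in_movable_area && !pages[i].migratable)
			return false;
	}
	return true;
}
```

In this model, a single page passed to `toy_make_guest_private()` is
enough to make `toy_area_can_be_vacated()` return false for the whole
area, which is the situation that long-term pinning avoids by migrating
pages out of such areas before pinning them.
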
Also unclear to me is how refcount and mapcount will be handled to
prevent swapping, who will actually do some kind of gfn-epfn etc.
mapping, how we'll forbid access to this memory e.g., via /proc/kcore or
when dumping memory ... and how it would ever work with
migration/swapping/rmap (it's clearly future work, but it's been raised
that this would be the way to make it work; I don't quite see how it
would all come together).

Last but not least, I raised to Intel via a different channel that I'd
appreciate updated hardware that avoids essentially crashing the
hypervisor when writing to encrypted memory from user space. It has the
smell of "broken hardware" to it that might just be fixed by a new
hardware generation to make it look more similar to other successful
implementations of secure/encrypted memory. That might make it much
easier to support an initial version of TDX -- instead of having to
reinvent the way we map guest memory just now to support hardware that
might sort out the root problem later.

That said, there might be benefits to mapping guest memory differently,
but my gut feeling is that it might take quite a long time to get
something reasonable working, to settle on a design, and to get it
accepted by all involved parties to merge it upstream.

Just my 2 cents, I might be all wrong as so often.

-- 
Thanks,

David / dhildenb