To: "Kirill A. Shutemov"
Cc: Chao Peng, "Kirill A. Shutemov", Andy Lutomirski,
 Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
 Jim Mattson, Joerg Roedel, kvm@vger.kernel.org,
 linux-kernel@vger.kernel.org, Borislav Petkov, Andrew Morton,
 Joerg Roedel, Andi Kleen, David Rientjes, Vlastimil Babka,
 Tom Lendacky, Thomas Gleixner, Peter Zijlstra, Ingo Molnar,
 Varad Gautam, Dario Faggioli, x86@kernel.org, linux-mm@kvack.org,
 linux-coco@lists.linux.dev, Kuppuswamy Sathyanarayanan, Dave Hansen,
 Yu Zhang
References: <20210824005248.200037-1-seanjc@google.com>
 <20210902184711.7v65p5lwhpr2pvk7@box.shutemov.name>
 <20210903191414.g7tfzsbzc7tpkx37@box.shutemov.name>
 <02806f62-8820-d5f9-779c-15c0e9cd0e85@kernel.org>
 <20210910171811.xl3lms6xoj3kx223@box.shutemov.name>
 <20210915195857.GA52522@chaop.bj.intel.com>
 <51a6f74f-6c05-74b9-3fd7-b7cd900fb8cc@redhat.com>
 <20210915142921.bxxsap6xktkt4bek@black.fi.intel.com>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [RFC] KVM: mm: fd-based approach for supporting KVM guest private memory
Date: Wed, 15 Sep 2021 16:59:46 +0200
In-Reply-To: <20210915142921.bxxsap6xktkt4bek@black.fi.intel.com>

>> I don't think we are, it still feels like we are in the early prototype
>> phase (even way before a PoC). I'd be happy to see something "cleaner" so to
>> say -- it still feels kind of hacky to me, especially there seem to be many
>> pieces of the big puzzle missing so far. Unfortunately, this series hasn't
>> caught the attention of many -MM people so far, maybe because other people
>> miss the big picture as well and are waiting for a complete design proposal.
>>
>> For example, what's unclear to me: we'll be allocating pages with
>> GFP_HIGHUSER_MOVABLE, making them land on MIGRATE_CMA or ZONE_MOVABLE; then
>> we silently turn them unmovable, which breaks these concepts. Who'd migrate
>> these pages away just like when doing long-term pinning, or how is that
>> supposed to work?
>
> That's fair point. We can fix it by changing mapping->gfp_mask.

That's essentially what secretmem does when setting up a file.

>
>> Also unclear to me is how refcount and mapcount will be handled to prevent
>> swapping,
>
> refcount and mapcount are unchanged. Pages not pinned per se. Swapping
> prevented with the change in shmem_writepage().

So when mapping into the guest, we'd increment the refcount but not the
mapcount, I assume?

>
>> who will actually do some kind of gfn-epfn etc. mapping, how we'll
>> forbid access to this memory e.g., via /proc/kcore or when dumping memory
>
> It's not aimed to prevent root to shoot into his leg. Root do root.

IMHO, being root is no excuse for a machine crash that results from
merely reading some file (one that is actually used in production
environments). That is not acceptable for distributions.

I'm still missing the whole gfn-epfn 1:1 mapping discussion, which we
identified as a requirement. Is that supposed to be done by KVM? How?

>
>> ... and how it would ever work with migration/swapping/rmap (it's clearly
>> future work, but it's been raised that this would be the way to make it
>> work, I don't quite see how it would all come together).
>
> Given that hardware supports it migration and swapping can be implemented
> by providing new callbacks in guest_ops. Like ->migrate_page would
> transfer encrypted data between pages and ->swapout would provide
> encrypted blob that can be put on disk or handled back to ->swapin to
> bring back to memory.

Again, I'm missing the complete picture. To make swapping decisions,
the vmscan code needs to track and handle dirty and reference
information. How would we be able to track references? Does the
hardware allow for temporarily unmapping encrypted memory and faulting
on it? How would page_referenced() continue working? "We can add
callbacks" is not a satisfying answer, at least for me, especially when
it comes to eventual locking problems and races.

Maybe saying "migration+swap is not supported" is clearer than "we can
add callbacks" while leaving out details of the bigger picture.

Again, a complete design proposal would be highly valuable, especially
to get some more review from other -MM folks. Otherwise, there is a high
chance that this will be rejected late, when trying to upstream it and
-MM people stumble over it (we've had a similar thing happen just
recently, unfortunately ...).

-- 
Thanks,

David / dhildenb
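
For readers following the gfp_mask point in this thread, here is a
minimal userspace sketch of the idea, not kernel code. It assumes
that the fix amounts to what secretmem appears to do at file setup:
replace the mapping's default movable allocation mask with a
non-movable one and mark the mapping unevictable. All model_* names
and the flag values are hypothetical stand-ins; the real definitions
live in the kernel's gfp and pagemap headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel gfp bits; values are illustrative only. */
typedef unsigned int model_gfp_t;
#define MODEL_GFP_HIGHUSER          0x1u  /* models GFP_HIGHUSER */
#define MODEL_GFP_MOVABLE           0x2u  /* models __GFP_MOVABLE */
#define MODEL_GFP_HIGHUSER_MOVABLE  (MODEL_GFP_HIGHUSER | MODEL_GFP_MOVABLE)

/* Models the two pieces of address_space state discussed in the thread. */
struct model_mapping {
	model_gfp_t gfp_mask;   /* mask used for page cache allocations */
	bool unevictable;       /* pages are kept off the reclaim path */
};

/* Default shmem-like setup: page cache pages are requested as movable. */
static inline void model_mapping_init(struct model_mapping *m)
{
	assert(m != NULL);
	m->gfp_mask = MODEL_GFP_HIGHUSER_MOVABLE;
	m->unevictable = false;
}

/*
 * Assumed fix: at file setup, drop the movable hint so that pages are
 * never allocated from places that rely on them staying migratable.
 */
static inline void model_make_guest_private(struct model_mapping *m)
{
	assert(m != NULL);
	m->gfp_mask = MODEL_GFP_HIGHUSER;
	m->unevictable = true;
}

/* Only movable requests may be served from ZONE_MOVABLE or CMA areas. */
static inline bool model_may_use_movable_areas(const struct model_mapping *m)
{
	return (m->gfp_mask & MODEL_GFP_MOVABLE) != 0;
}
```

The point of doing this at setup time rather than later is that pages
are never placed in ZONE_MOVABLE or MIGRATE_CMA in the first place, so
nothing has to migrate them away afterwards.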