From: "Andy Lutomirski"
To: "Kirill A. Shutemov"
Cc: "Kirill A . Shutemov" , "Hugh Dickins" , "Chao Peng" , "kvm list" , "Linux Kernel Mailing List" , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, "Linux API" , linux-doc@vger.kernel.org, qemu-devel@nongnu.org, linux-kselftest@vger.kernel.org, "Paolo Bonzini" , "Jonathan Corbet" , "Sean Christopherson" , "Vitaly Kuznetsov" , "Wanpeng Li" , "Jim Mattson" , "Joerg Roedel" , "Thomas Gleixner" , "Ingo Molnar" , "Borislav Petkov" , "the arch/x86 maintainers" , "H. Peter Anvin" , "Jeff Layton" , "J . Bruce Fields" , "Andrew Morton" , "Shuah Khan" , "Mike Rapoport" , "Steven Price" , "Maciej S . Szmigiero" , "Vlastimil Babka" , "Vishal Annapurve" , "Yu Zhang" , "Nakajima, Jun" , "Dave Hansen" , "Andi Kleen" , "David Hildenbrand" , aarcange@redhat.com, ddutile@redhat.com, dhildenb@redhat.com, "Quentin Perret" , "Michael Roth" , "Michal Hocko" , "Muchun Song" , "Gupta, Pankaj"
Subject: Re: [PATCH v7 00/14] KVM: mm: fd-based approach for supporting KVM guest private memory
Date: Fri, 09 Sep 2022 12:11:05 -0700
Message-Id: <762581e4-a6bf-41d1-b0d3-72543153ffb1@www.fastmail.com>
In-Reply-To: <20220909143236.sznwzkpedldrlnn5@box.shutemov.name>
References: <20220706082016.2603916-1-chao.p.peng@linux.intel.com> <20220818132421.6xmjqduempmxnnu2@box> <20220820002700.6yflrxklmpsavdzi@box.shutemov.name> <95bd287b-d17f-fda8-58c9-20700b1e0c72@kernel.org> <20220909143236.sznwzkpedldrlnn5@box.shutemov.name>

On Fri, Sep 9, 2022, at 7:32 AM, Kirill A . Shutemov wrote:
> On Thu, Sep 08, 2022 at 09:48:35PM -0700, Andy Lutomirski wrote:
>> On 8/19/22 17:27, Kirill A. Shutemov wrote:
>> > On Thu, Aug 18, 2022 at 08:00:41PM -0700, Hugh Dickins wrote:
>> > > On Thu, 18 Aug 2022, Kirill A . Shutemov wrote:
>> > > > On Wed, Aug 17, 2022 at 10:40:12PM -0700, Hugh Dickins wrote:
>> > > > >
>> > > > > If your memory could be swapped, that would be enough of a good reason
>> > > > > to make use of shmem.c: but it cannot be swapped; and although there
>> > > > > are some references in the mailthreads to it perhaps being swappable
>> > > > > in future, I get the impression that will not happen soon if ever.
>> > > > >
>> > > > > If your memory could be migrated, that would be some reason to use
>> > > > > filesystem page cache (because page migration happens to understand
>> > > > > that type of memory): but it cannot be migrated.
>> > > >
>> > > > Migration support is in pipeline. It is part of TDX 1.5 [1]. And swapping
>> > > > theoretically possible, but I'm not aware of any plans as of now.
>> > > >
>> > > > [1] https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html
>> > >
>> > > I always forget, migration means different things to different audiences.
>> > > As an mm person, I was meaning page migration, whereas a virtualization
>> > > person thinks VM live migration (which that reference appears to be about),
>> > > a scheduler person task migration, an ornithologist bird migration, etc.
>> > >
>> > > But you're an mm person too: you may have cited that reference in the
>> > > knowledge that TDX 1.5 Live Migration will entail page migration of the
>> > > kind I'm thinking of. (Anyway, it's not important to clarify that here.)
>> >
>> > TDX 1.5 brings both.
>> >
>> > In TDX speak, mm migration called relocation. See TDH.MEM.PAGE.RELOCATE.
>> >
>>
>> This seems to be a pretty bad fit for the way that the core mm migrates
>> pages. The core mm unmaps the page, then moves (in software) the contents
>> to a new address, then faults it in. TDH.MEM.PAGE.RELOCATE doesn't fit into
>> that workflow very well. I'm not saying it can't be done, but it won't just
>> work.
>
> Hm. From what I see we have all necessary infrastructure in place.
>
> Unmaping is NOP for inaccessible pages as it is never mapped and we have
> mapping->a_ops->migrate_folio() callback that allows to replace software
> copying with whatever is needed, like TDH.MEM.PAGE.RELOCATE.
>
> What do I miss?

Hmm, maybe this isn't as bad as I thought.

Right now, unless I've missed something, the migration workflow is to unmap (via try_to_migrate) all mappings, then migrate the backing store (with ->migrate_folio(), although it seems like most callers expect the actual copy to happen outside of ->migrate_folio()), and then make new mappings. With the *current* (vma-based, not fd-based) model for KVM memory, this won't work -- we can't unmap before calling TDH.MEM.PAGE.RELOCATE.
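
To make that ordering concrete, here is a toy userspace model in C. It is not kernel code and not a proposed implementation: struct toy_folio, struct toy_aops, toy_migrate() and the secure_mapped flag are all hypothetical names made up for illustration. It also assumes, as a simplification of the vma-based model, that tearing down a userspace mapping zaps the secure EPT entry as well, and that a RELOCATE-style backend fails once that entry is gone.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct toy_folio;

/* Stand-in for the ->migrate_folio() hook; hypothetical, not the kernel type. */
struct toy_aops {
	int (*migrate_folio)(struct toy_folio *dst, struct toy_folio *src);
};

struct toy_folio {
	char data[16];
	int nr_mappings;		/* userspace mappings of this folio */
	bool secure_mapped;		/* toy flag: present in the secure EPT */
	const struct toy_aops *a_ops;
};

/* Ordinary backend: a plain software copy of the contents. */
int toy_copy_migrate(struct toy_folio *dst, struct toy_folio *src)
{
	memcpy(dst->data, src->data, sizeof(dst->data));
	return 0;
}

/*
 * Backend standing in for a RELOCATE-style operation. Assumption for
 * this sketch: it only works while the source is still secure-mapped.
 */
int toy_relocate_migrate(struct toy_folio *dst, struct toy_folio *src)
{
	if (!src->secure_mapped)
		return -1;
	memcpy(dst->data, src->data, sizeof(dst->data));
	dst->secure_mapped = true;
	src->secure_mapped = false;
	return 0;
}

const struct toy_aops toy_copy_ops = { .migrate_folio = toy_copy_migrate };
const struct toy_aops toy_reloc_ops = { .migrate_folio = toy_relocate_migrate };

int toy_migrate(struct toy_folio *dst, struct toy_folio *src)
{
	int unmapped = src->nr_mappings;
	int ret;

	/* Phase 1: unmap. With no mappings there is nothing to zap. */
	if (unmapped > 0) {
		src->nr_mappings = 0;
		src->secure_mapped = false;	/* assumed side effect of unmapping */
	}

	/* Phase 2: let the backing store move the contents. */
	assert(src->a_ops && src->a_ops->migrate_folio);
	ret = src->a_ops->migrate_folio(dst, src);
	if (ret) {
		src->nr_mappings = unmapped;	/* put the old mappings back */
		return ret;
	}

	/* Phase 3: recreate the mappings, now pointing at the new folio. */
	dst->nr_mappings = unmapped;
	return 0;
}
```

In this model a folio that starts with nr_mappings == 0 reaches the callback with secure_mapped still set, so toy_relocate_migrate() succeeds and phase 3 has nothing to recreate. A folio with a userspace mapping loses secure_mapped in phase 1 and the same callback fails.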

But maybe it's actually okay, with some care or maybe mild modifications, with the fd-based model. We don't have any mmaps, per se, to unmap for secret / INACCESSIBLE memory. So maybe we can get all the way to ->migrate_folio() without zapping anything in the secure EPT and just call TDH.MEM.PAGE.RELOCATE from inside migrate_folio(). And there will be nothing to fault back in. From the core code's perspective, it's like migrating a memfd that doesn't happen to have any mappings at the time.

--Andy