Date: Tue, 3 Jan 2023 23:06:37 +0000
From: Sean Christopherson
To: "Wang, Wei W"
Cc: Chao Peng, "Qiang, Chenyi", "kvm@vger.kernel.org",
 "linux-kernel@vger.kernel.org", "linux-mm@kvack.org",
 "linux-fsdevel@vger.kernel.org", "linux-arch@vger.kernel.org",
 "linux-api@vger.kernel.org", "linux-doc@vger.kernel.org",
 "qemu-devel@nongnu.org", Paolo Bonzini, Jonathan Corbet, Vitaly Kuznetsov,
 Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Arnd Bergmann, Naoya Horiguchi, Miaohe Lin,
 "x86@kernel.org", "H . Peter Anvin", Hugh Dickins, Jeff Layton,
 "J . Bruce Fields", Andrew Morton, Shuah Khan, Mike Rapoport, Steven Price,
 "Maciej S . Szmigiero", Vlastimil Babka, Vishal Annapurve, Yu Zhang,
 "Kirill A . Shutemov", "Lutomirski, Andy", "Nakajima, Jun", "Hansen, Dave",
 "ak@linux.intel.com", "david@redhat.com", "aarcange@redhat.com",
 "ddutile@redhat.com", "dhildenb@redhat.com", Quentin Perret,
 "tabba@google.com", Michael Roth, "Hocko, Michal"
Subject: Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes
References: <20221202061347.1070246-1-chao.p.peng@linux.intel.com>
 <20221202061347.1070246-3-chao.p.peng@linux.intel.com>
 <1c9bbaa5-eea3-351e-d6a0-cfbc32115c82@intel.com>
 <20230103013948.GA2178318@chaop.bj.intel.com>

On Tue, Jan 03, 2023, Wang, Wei W wrote:
> On Tuesday, January 3, 2023 9:40 AM, Chao Peng wrote:
> > > Because guest memory defaults to private, and now this patch stores
> > > the attributes with KVM_MEMORY_ATTRIBUTE_PRIVATE instead of _SHARED,
> > > it would bring more KVM_EXIT_MEMORY_FAULT exits at the beginning of
> > > boot time. Maybe it can be optimized somehow in other places? e.g. set
> > > mem attr in advance.
> >
> > KVM defaults to 'shared' because this ioctl can also be potentially used by
> > normal VMs and 'shared' sounds a value meaningful for both normal VMs and
> > confidential VMs.
>
> Do you mean a normal VM could have pages marked private? What's the usage?
> (If all the pages are just marked shared for normal VMs, then why do we need it)

No, there are potential use cases for per-page attributes/permissions, e.g. to
make select pages read-only, exec-only, no-exec, etc...

> > As for more KVM_EXIT_MEMORY_FAULT exits during the
> > booting time, yes, setting all memory to 'private' for confidential VMs through
> > this ioctl in userspace before guest launch is an approach for KVM userspace to
> > 'override' the KVM default and reduce the number of implicit conversions.
>
> Most pages of a confidential VM are likely to be private pages. It seems more efficient
> (and not difficult to check vm_type) to have KVM defaults to "private" for confidential VMs
> and defaults to "shared" for normal VMs.

If done right, the default shouldn't matter all that much for efficiency.  KVM
needs to be able to efficiently track large ranges regardless of the default,
otherwise the memory overhead and, presumably, the cost of lookups will be
painful.  E.g. converting a 1GiB chunk to shared should ideally require one
entry, not 256k entries (one per 4KiB page).
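
For scale, the 256k figure is just the number of 4KiB pages in 1GiB.  A minimal
sketch of that arithmetic, assuming 4KiB pages and one stored entry per page
(the helper name and the PAGE_SHIFT_4K macro are made up for illustration and
are not from the patch):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12	/* assumption: 4KiB base pages */

/*
 * Hypothetical helper: number of entries needed to describe a range when
 * every entry covers exactly one 4KiB page.  1GiB / 4KiB = 262144 (256Ki).
 */
static uint64_t entries_at_4k_granularity(uint64_t range_bytes)
{
	/* Only whole pages make sense here. */
	assert((range_bytes & ((1ULL << PAGE_SHIFT_4K) - 1)) == 0);
	return range_bytes >> PAGE_SHIFT_4K;
}
```

A range-based entry would describe the same 1GiB conversion with a single
(start, end, attributes) record.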

Looks like that behavior was changed in v8 in response to feedback[*] that doing
xa_store_range() on a subset of an existing range (entry) would overwrite the
entire existing range (entry), not just the smaller subset.  xa_store_range()
does appear to be too simplistic for this use case, but looking at
__filemap_add_folio(), splitting an existing entry isn't super complex.

Using xa_store() for the very initial implementation is ok, and probably a good
idea since it's more obviously correct and will give us a bisection point.  But
we definitely want a more performant implementation sooner rather than later.
The hardest part will likely be merging existing entries, but that can be done
separately too, and is probably lower priority.

E.g. (1) use xa_store() and always track at 4KiB granularity, (2) support storing
metadata in multi-index entries, and finally (3) support merging adjacent entries
with identical values.

[*] https://lore.kernel.org/all/CAGtprH9xyw6bt4=RBWF6-v2CSpabOCpKq5rPz+e-9co7EisoVQ@mail.gmail.com
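
To illustrate the split and merge behavior in steps (2) and (3), here is a rough
userspace sketch.  It is not XArray code and not what the patch does: it swaps
in a small sorted array of [start, end) gfn ranges, and every name in it
(attr_range, attr_map, attr_map_set, attr_map_get, MAX_RANGES) is made up for
illustration.  It also assumes an attribute value of 0 means "default" and is
simply not stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace sketch; none of this is KVM or XArray code. */
#define MAX_RANGES 64

struct attr_range {
	uint64_t start;	/* first gfn covered, inclusive */
	uint64_t end;	/* end of the range, exclusive */
	uint64_t attrs;	/* 0 is treated as "default" and never stored */
};

struct attr_map {
	struct attr_range r[MAX_RANGES];	/* sorted, non-overlapping */
	size_t nr;
};

static uint64_t min_u64(uint64_t a, uint64_t b) { return a < b ? a : b; }
static uint64_t max_u64(uint64_t a, uint64_t b) { return a > b ? a : b; }

/* Append a range, merging it into the previous entry when possible. */
static void emit(struct attr_map *m, uint64_t start, uint64_t end,
		 uint64_t attrs)
{
	struct attr_range *prev = m->nr ? &m->r[m->nr - 1] : NULL;

	if (start >= end || !attrs)	/* empty piece or default value */
		return;
	if (prev && prev->end == start && prev->attrs == attrs) {
		prev->end = end;	/* merge adjacent identical entries */
		return;
	}
	assert(m->nr < MAX_RANGES);
	m->r[m->nr++] = (struct attr_range){ start, end, attrs };
}

/* Set attrs on [start, end), splitting any entry that partially overlaps. */
static void attr_map_set(struct attr_map *m, uint64_t start, uint64_t end,
			 uint64_t attrs)
{
	struct attr_map out = { .nr = 0 };
	int placed = 0;
	size_t i;

	assert(start < end);
	for (i = 0; i < m->nr; i++) {
		const struct attr_range *o = &m->r[i];

		/* Keep the part of the old entry below the new range. */
		emit(&out, o->start, min_u64(o->end, start), o->attrs);
		/* Place the new range before anything that lies above it. */
		if (!placed && o->end > start) {
			emit(&out, start, end, attrs);
			placed = 1;
		}
		/* Keep the part of the old entry above the new range. */
		emit(&out, max_u64(o->start, end), o->end, o->attrs);
	}
	if (!placed)
		emit(&out, start, end, attrs);
	*m = out;
}

/* Linear lookup, for brevity; returns 0 (default) for untracked gfns. */
static uint64_t attr_map_get(const struct attr_map *m, uint64_t gfn)
{
	size_t i;

	for (i = 0; i < m->nr; i++)
		if (gfn >= m->r[i].start && gfn < m->r[i].end)
			return m->r[i].attrs;
	return 0;
}
```

With this sketch, marking 262144 gfns (1GiB of 4KiB pages) with one attribute
takes one entry, resetting a sub-range to the default splits it into two, and
setting the sub-range back merges everything into one entry again.  A real
implementation would presumably need locking, allocation failure handling, and
a lookup better than a linear scan.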