From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 13 Apr 2023 19:04:05 +0300
From: "Kirill A. Shutemov"
To: Sean Christopherson
Cc: Liam Merwick, Chao Peng, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-api@vger.kernel.org, linux-doc@vger.kernel.org,
	qemu-devel@nongnu.org, Paolo Bonzini, Jonathan Corbet,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Arnd Bergmann,
	Naoya Horiguchi, Miaohe Lin, x86@kernel.org, "H . Peter Anvin",
	Hugh Dickins, Jeff Layton, "J . Bruce Fields", Andrew Morton,
	Shuah Khan, Mike Rapoport, Steven Price, "Maciej S . Szmigiero",
	Vlastimil Babka, Vishal Annapurve, Yu Zhang, "Kirill A . Shutemov",
	luto@kernel.org, jun.nakajima@intel.com, dave.hansen@intel.com,
	ak@linux.intel.com, david@redhat.com, aarcange@redhat.com,
	ddutile@redhat.com, dhildenb@redhat.com, Quentin Perret,
	tabba@google.com, Michael Roth, mhocko@suse.com,
	wei.w.wang@intel.com
Subject: Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM
Message-ID: <20230413160405.h6ov2yl6l3i7mvsj@box.shutemov.name>
References: <20221202061347.1070246-1-chao.p.peng@linux.intel.com>
	<48953bf2-cee9-f818-dc50-5fb5b9b410bf@oracle.com>
	<20230125125321.yvsivupbbaqkb7a5@box.shutemov.name>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Apr 12, 2023 at 06:07:28PM -0700, Sean Christopherson wrote:
> On Wed, Jan 25, 2023, Kirill A. Shutemov wrote:
> > On Wed, Jan 25, 2023 at 12:20:26AM +0000, Sean Christopherson wrote:
> > > On Tue, Jan 24, 2023, Liam Merwick wrote:
> > > > On 14/01/2023 00:37, Sean Christopherson wrote:
> > > > > On Fri, Dec 02, 2022, Chao Peng wrote:
> > > > > > This patch series implements KVM guest private memory for confidential
> > > > > > computing scenarios like Intel TDX[1].
> > > > > > If a TDX host accesses
> > > > > > TDX-protected guest memory, machine check can happen which can further
> > > > > > crash the running host system, this is terrible for multi-tenant
> > > > > > configurations. The host accesses include those from KVM userspace like
> > > > > > QEMU. This series addresses KVM userspace induced crash by introducing
> > > > > > new mm and KVM interfaces so KVM userspace can still manage guest memory
> > > > > > via a fd-based approach, but it can never access the guest memory
> > > > > > content.
> > > > > >
> > > > > > The patch series touches both core mm and KVM code. I appreciate
> > > > > > Andrew/Hugh and Paolo/Sean can review and pick these patches. Any other
> > > > > > reviews are always welcome.
> > > > > >   - 01: mm change, target for mm tree
> > > > > >   - 02-09: KVM change, target for KVM tree
> > > > >
> > > > > A version with all of my feedback, plus reworked versions of Vishal's selftest,
> > > > > is available here:
> > > > >
> > > > >   git@github.com:sean-jc/linux.git x86/upm_base_support
> > > > >
> > > > > It compiles and passes the selftest, but it's otherwise barely tested. There are
> > > > > a few todos (2 I think?) and many of the commits need changelogs, i.e. it's still
> > > > > a WIP.
> > > > >
> > > >
> > > > When running LTP (https://github.com/linux-test-project/ltp) on the v10
> > > > bits (and also with Sean's branch above) I encounter the following NULL
> > > > pointer dereference with testcases/kernel/syscalls/madvise/madvise01
> > > > (100% reproducible).
> > > >
> > > > It appears that in restrictedmem_error_page()
> > > > inode->i_mapping->private_data is NULL in the
> > > > list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) but I
> > > > don't know why.
> > >
> > > Kirill, can you take a look? Or pass the buck to someone who can? :-)
> >
> > The patch below should help.
> >
> > diff --git a/mm/restrictedmem.c b/mm/restrictedmem.c
> > index 15c52301eeb9..39ada985c7c0 100644
> > --- a/mm/restrictedmem.c
> > +++ b/mm/restrictedmem.c
> > @@ -307,14 +307,29 @@ void restrictedmem_error_page(struct page *page, struct address_space *mapping)
> >  
> >  	spin_lock(&sb->s_inode_list_lock);
> >  	list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
> > -		struct restrictedmem *rm = inode->i_mapping->private_data;
> >  		struct restrictedmem_notifier *notifier;
> > -		struct file *memfd = rm->memfd;
> > +		struct restrictedmem *rm;
> >  		unsigned long index;
> > +		struct file *memfd;
> >  
> > -		if (memfd->f_mapping != mapping)
> > +		if (atomic_read(&inode->i_count))
>
> Kirill, should this be
>
> 	if (!atomic_read(&inode->i_count))
> 		continue;
>
> i.e. skip unreferenced inodes, not skip referenced inodes?

Ouch. Yes.

But looking at other instances of s_inodes usage, I think we can drop the
check altogether. An inode cannot be completely freed until it is removed
from the s_inodes list.

While there, replace list_for_each_entry_safe() with list_for_each_entry()
as we don't remove anything from the list.
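
To illustrate why the _safe variant is unnecessary for a walk that never
unlinks entries, here is a minimal userspace sketch. It is not kernel code:
struct list_node, struct item, and every helper below are hypothetical
stand-ins for the kernel's list API, written only for this example. The
read-only walk can follow pos->next directly, while the removing walk has to
save the next pointer before unlinking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's circular doubly linked list. */
struct list_node {
	struct list_node *prev, *next;
};

struct item {
	int id;
	struct list_node link;
};

/* Recover the containing item from its embedded list node. */
#define item_of(ptr) \
	((struct item *)((char *)(ptr) - offsetof(struct item, link)))

void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

void list_add_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

void list_del(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = NULL;	/* poison: stale walks would crash */
}

/* Read-only walk: nothing is unlinked, so following pos->next is fine. */
int sum_ids(struct list_node *head)
{
	int sum = 0;

	for (struct list_node *pos = head->next; pos != head; pos = pos->next)
		sum += item_of(pos)->id;
	return sum;
}

/* Removing walk: remember the successor before the current node is unlinked. */
int remove_even_ids(struct list_node *head)
{
	struct list_node *pos, *next;
	int removed = 0;

	for (pos = head->next, next = pos->next; pos != head;
	     pos = next, next = pos->next) {
		if (item_of(pos)->id % 2 == 0) {
			list_del(pos);
			removed++;
		}
	}
	return removed;
}

/* Build ids 1..4, walk read-only, then drop the even ids. */
int list_demo(void)
{
	struct list_node head;
	struct item items[4];

	list_init(&head);
	for (int i = 0; i < 4; i++) {
		items[i].id = i + 1;
		list_add_tail(&items[i].link, &head);
	}
	assert(sum_ids(&head) == 10);
	assert(remove_even_ids(&head) == 2);
	return sum_ids(&head);	/* only ids 1 and 3 remain */
}
```

Calling list_demo() from a main() exercises both walks. The sketch is
single-threaded and ignores locking, which the kernel code handles with
s_inode_list_lock.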

diff --git a/mm/restrictedmem.c b/mm/restrictedmem.c
index 55e99e6c09a1..8e8a4420d3d1 100644
--- a/mm/restrictedmem.c
+++ b/mm/restrictedmem.c
@@ -194,22 +194,19 @@ static int restricted_error_remove_page(struct address_space *mapping,
 						struct page *page)
 {
 	struct super_block *sb = restrictedmem_mnt->mnt_sb;
-	struct inode *inode, *next;
+	struct inode *inode;
 	pgoff_t start, end;
 
 	start = page->index;
 	end = start + thp_nr_pages(page);
 
 	spin_lock(&sb->s_inode_list_lock);
-	list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
+	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
 		struct restrictedmem_notifier *notifier;
 		struct restrictedmem *rm;
 		unsigned long index;
 		struct file *memfd;
 
-		if (atomic_read(&inode->i_count))
-			continue;
-
 		spin_lock(&inode->i_lock);
 		if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) {
 			spin_unlock(&inode->i_lock);
-- 
  Kiryl Shutsemau / Kirill A. Shutemov