Date: Thu, 7 Jul 2022 20:08:04 +0000
From: Sean Christopherson
To: Xiaoyao Li
Cc: Michael Roth, Vishal Annapurve, Chao Peng, "Nikunj A. Dadhania",
 kvm list, LKML, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-api@vger.kernel.org, linux-doc@vger.kernel.org,
 qemu-devel@nongnu.org, Paolo Bonzini, Jonathan Corbet, Vitaly Kuznetsov,
 Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, x86, "H . Peter Anvin", Hugh Dickins, Jeff Layton,
 "J . Bruce Fields", Andrew Morton, Mike Rapoport, Steven Price,
 "Maciej S . Szmigiero", Vlastimil Babka, Yu Zhang, "Kirill A . Shutemov",
 Andy Lutomirski, Jun Nakajima, Dave Hansen, Andi Kleen, David Hildenbrand,
 aarcange@redhat.com, ddutile@redhat.com, dhildenb@redhat.com,
 Quentin Perret, mhocko@suse.com
Subject: Re: [PATCH v6 6/8] KVM: Handle page fault for private memory
References: <20220519153713.819591-1-chao.p.peng@linux.intel.com>
 <20220519153713.819591-7-chao.p.peng@linux.intel.com>
 <20220624090246.GA2181919@chaop.bj.intel.com>
 <20220630222140.of4md7bufd5jv5bh@amd.com>
 <4fe3b47d-e94a-890a-5b87-6dfb7763bc7e@intel.com>
In-Reply-To: <4fe3b47d-e94a-890a-5b87-6dfb7763bc7e@intel.com>

On Fri, Jul 01, 2022, Xiaoyao Li wrote:
> On 7/1/2022 6:21 AM, Michael Roth wrote:
> > On Thu, Jun 30, 2022 at 12:14:13PM -0700, Vishal Annapurve wrote:
> > > With transparent_hugepages=always setting I see issues with the
> > > current implementation.

...

> > > Looks like with transparent huge pages enabled kvm tried to handle the
> > > shared memory fault on 0x84d gfn by coalescing nearby 4K pages
> > > to form a contiguous 2MB page mapping at gfn 0x800, since level 2 was
> > > requested in kvm_mmu_spte_requested.
> > > This caused the private memory contents from regions 0x800-0x84c and
> > > 0x86e-0xa00 to get unmapped from the guest leading to guest vm
> > > shutdown.
> >
> > Interesting... seems like that wouldn't be an issue for non-UPM SEV, since
> > the private pages would still be mapped as part of that 2M mapping, and
> > it's completely up to the guest as to whether it wants to access as
> > private or shared. But for UPM it makes sense this would cause issues.
> >
> > >
> > > Does getting the mapping level as per the fault access type help
> > > address the above issue? Any such coalescing should not cross between
> > > private to
> > > shared or shared to private memory regions.
> >
> > Doesn't seem like changing the check to fault->is_private would help in
> > your particular case, since the subsequent host_pfn_mapping_level() call
> > only seems to limit the mapping level to whatever the mapping level is
> > for the HVA in the host page table.
> >
> > Seems like with UPM we need some additional handling here that also
> > checks that the entire 2M HVA range is backed by non-private memory.
> >
> > Non-UPM SNP hypervisor patches already have a similar hook added to
> > host_pfn_mapping_level() which implements such a check via RMP table, so
> > UPM might need something similar:
> >
> >   https://github.com/AMDESE/linux/commit/ae4475bc740eb0b9d031a76412b0117339794139
> >
> > -Mike
> >
>
> For TDX, we try to track the page type (shared, private, mixed) of each gfn
> at given level. Only when the type is shared/private, can it be mapped at
> that level. When it's mixed, i.e., it contains both shared pages and private
> pages at given level, it has to go to next smaller level.
>
> https://github.com/intel/tdx/commit/ed97f4042eb69a210d9e972ccca6a84234028cad

Hmm, so a new slot->arch.page_attr array shouldn't be necessary; KVM can
instead update slot->arch.lpage_info on shared<->private conversions.
Detecting whether a given range is partially mapped could get nasty if KVM
defers tracking to the backing store, but if KVM itself does the tracking as
was previously suggested[*], then updating lpage_info should be relatively
straightforward, e.g. use xa_for_each_range() to see if a given 2MB/1GB range
is completely covered (fully shared) or not covered at all (fully private).

[*] https://lore.kernel.org/all/YofeZps9YXgtP3f1@google.com
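
To make the "completely covered or not covered at all" check concrete, here
is a minimal userspace sketch, not KVM code. All names in it (struct
shared_range, classify_range(), hugepage_allowed(), PAGES_PER_2M) are
hypothetical, and a plain array of non-overlapping [start, end) gfn ranges
stands in for the walk that xa_for_each_range() would do in the kernel. It
only models the decision, i.e. whether a hugepage-aligned range is fully
shared, fully private, or mixed; it does not touch lpage_info.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: number of 4K pages in one 2MB mapping. */
#define PAGES_PER_2M 512ULL

enum range_kind { RANGE_FULLY_PRIVATE, RANGE_FULLY_SHARED, RANGE_MIXED };

/*
 * One tracked shared range, [start, end) in gfns.  Assumption: the ranges
 * passed in never overlap each other, as entries in a range store would not.
 */
struct shared_range {
	uint64_t start;
	uint64_t end;
};

/* Classify [start, end) by how many of its gfns are tracked as shared. */
enum range_kind classify_range(const struct shared_range *ranges, size_t nr,
			       uint64_t start, uint64_t end)
{
	uint64_t covered = 0;

	assert(start < end);

	for (size_t i = 0; i < nr; i++) {
		/* Intersect the tracked range with the queried range. */
		uint64_t lo = ranges[i].start > start ? ranges[i].start : start;
		uint64_t hi = ranges[i].end < end ? ranges[i].end : end;

		if (lo < hi)
			covered += hi - lo;
	}

	if (covered == 0)
		return RANGE_FULLY_PRIVATE;	/* not covered at all */
	if (covered == end - start)
		return RANGE_FULLY_SHARED;	/* completely covered */
	return RANGE_MIXED;
}

/*
 * A mapping of 'pages' 4K pages containing 'gfn' is allowed only if the
 * aligned range around gfn is not mixed.  'pages' must be a power of two.
 */
bool hugepage_allowed(const struct shared_range *ranges, size_t nr,
		      uint64_t gfn, uint64_t pages)
{
	uint64_t base;

	assert(pages && (pages & (pages - 1)) == 0);

	base = gfn & ~(pages - 1);
	return classify_range(ranges, nr, base, base + pages) != RANGE_MIXED;
}
```

Plugging in the gfns from Vishal's report: with a single shared range
[0x84d, 0x86e), the 2MB-aligned range [0x800, 0xa00) classifies as mixed, so
hugepage_allowed(ranges, 1, 0x84d, PAGES_PER_2M) returns false and the fault
would have to be mapped at 4K instead.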