From mboxrd@z Thu Jan 1 00:00:00 1970
Reply-To: Sean Christopherson <seanjc@google.com>
Date: Fri, 5 Aug 2022 23:05:07 +0000
In-Reply-To: <20220805230513.148869-1-seanjc@google.com>
Message-Id: <20220805230513.148869-3-seanjc@google.com>
Mime-Version: 1.0
References: <20220805230513.148869-1-seanjc@google.com>
X-Mailer: git-send-email 2.37.1.559.g78731f0fdb-goog
Subject: [PATCH v3 2/8] KVM: x86/mmu: Tag disallowed NX huge pages even if
 they're not tracked
From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, David Matlack,
 Yan Zhao, Mingwei Zhang, Ben Gardon
Content-Type: text/plain; charset="UTF-8"
X-Mailing-List: kvm@vger.kernel.org

Tag shadow pages that cannot be replaced with an NX huge page regardless
of whether or not zapping the page would allow KVM to immediately create
a huge page, e.g.
because something else prevents creating a huge page.  I.e. track pages
that are disallowed from being NX huge pages regardless of whether or not
the page could have been huge at the time of fault.

KVM currently tracks pages that were disallowed from being huge due to
the NX workaround if and only if the page could otherwise be huge.  But
that fails to handle the scenario where whatever restriction prevented
KVM from installing a huge page goes away, e.g. if dirty logging is
disabled, the host mapping level changes, etc...

Failure to tag shadow pages appropriately could theoretically lead to
false negatives, e.g. if a fetch fault requests a small page and thus
isn't tracked, and a read/write fault later requests a huge page, KVM
will not reject the huge page as it should.

To avoid yet another flag, initialize the list_head and use list_empty()
to determine whether or not a page is on the list of NX huge pages that
should be recovered.

Note, the TDP MMU accounting is still flawed as fixing the TDP MMU is
more involved due to mmu_lock being held for read.  This will be
addressed in a future commit.

Fixes: 5bcaf3e1715f ("KVM: x86/mmu: Account NX huge page disallowed iff huge page was requested")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/mmu.c          | 27 +++++++++++++++++++--------
 arch/x86/kvm/mmu/mmu_internal.h | 10 +++++++++-
 arch/x86/kvm/mmu/paging_tmpl.h  |  6 +++---
 arch/x86/kvm/mmu/tdp_mmu.c      |  4 +++-
 4 files changed, 34 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 36b898dbde91..55dac44f3397 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -802,15 +802,20 @@ static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 		kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
 }
 
-void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+			  bool nx_huge_page_possible)
 {
-	if (KVM_BUG_ON(sp->lpage_disallowed, kvm))
+	if (KVM_BUG_ON(!list_empty(&sp->lpage_disallowed_link), kvm))
+		return;
+
+	sp->lpage_disallowed = true;
+
+	if (!nx_huge_page_possible)
 		return;
 
 	++kvm->stat.nx_lpage_splits;
 	list_add_tail(&sp->lpage_disallowed_link,
 		      &kvm->arch.lpage_disallowed_mmu_pages);
-	sp->lpage_disallowed = true;
 }
 
 static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
@@ -832,9 +837,13 @@ static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 
 void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
-	--kvm->stat.nx_lpage_splits;
 	sp->lpage_disallowed = false;
-	list_del(&sp->lpage_disallowed_link);
+
+	if (list_empty(&sp->lpage_disallowed_link))
+		return;
+
+	--kvm->stat.nx_lpage_splits;
+	list_del_init(&sp->lpage_disallowed_link);
 }
 
 static struct kvm_memory_slot *
@@ -2115,6 +2124,8 @@ static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm *kvm,
 
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
 
+	INIT_LIST_HEAD(&sp->lpage_disallowed_link);
+
 	/*
 	 * active_mmu_pages must be a FIFO list, as kvm_zap_obsolete_pages()
 	 * depends on valid pages being added to the head of the list.  See
@@ -3112,9 +3123,9 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 			continue;
 
 		link_shadow_page(vcpu, it.sptep, sp);
-		if (fault->is_tdp && fault->huge_page_disallowed &&
-		    fault->req_level >= it.level)
-			account_huge_nx_page(vcpu->kvm, sp);
+		if (fault->is_tdp && fault->huge_page_disallowed)
+			account_huge_nx_page(vcpu->kvm, sp,
+					     fault->req_level >= it.level);
 	}
 
 	if (WARN_ON_ONCE(it.level != fault->goal_level))
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 582def531d4d..cca1ad75d096 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -100,6 +100,13 @@ struct kvm_mmu_page {
 		};
 	};
 
+	/*
+	 * Tracks shadow pages that, if zapped, would allow KVM to create an NX
+	 * huge page.  A shadow page will have lpage_disallowed set but not be
+	 * on the list if a huge page is disallowed for other reasons, e.g.
+	 * because KVM is shadowing a PTE at the same gfn, the memslot isn't
+	 * properly aligned, etc...
+	 */
 	struct list_head lpage_disallowed_link;
 #ifdef CONFIG_X86_32
 	/*
@@ -315,7 +322,8 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
 
 void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
 
-void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+			  bool nx_huge_page_possible);
 void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp);
 
 #endif /* __KVM_X86_MMU_INTERNAL_H */
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index f5958071220c..e450f49f2225 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -713,9 +713,9 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 			continue;
 
 		link_shadow_page(vcpu, it.sptep, sp);
-		if (fault->huge_page_disallowed &&
-		    fault->req_level >= it.level)
-			account_huge_nx_page(vcpu->kvm, sp);
+		if (fault->huge_page_disallowed)
+			account_huge_nx_page(vcpu->kvm, sp,
+					     fault->req_level >= it.level);
 	}
 
 	if (WARN_ON_ONCE(it.level != fault->goal_level))
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index bf2ccf9debca..903d0d3497b6 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -284,6 +284,8 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu)
 static void tdp_mmu_init_sp(struct kvm_mmu_page *sp, tdp_ptep_t sptep,
 			    gfn_t gfn, union kvm_mmu_page_role role)
 {
+	INIT_LIST_HEAD(&sp->lpage_disallowed_link);
+
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
 
 	sp->role = role;
@@ -1130,7 +1132,7 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
 	spin_lock(&kvm->arch.tdp_mmu_pages_lock);
 	list_add(&sp->link, &kvm->arch.tdp_mmu_pages);
 	if (account_nx)
-		account_huge_nx_page(kvm, sp);
+		account_huge_nx_page(kvm, sp, true);
 	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
 
 	return 0;
-- 
2.37.1.559.g78731f0fdb-goog
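
For reference, the list_empty()-as-flag idiom the patch relies on can be
exercised outside the kernel.  The sketch below is not KVM code: the list
helpers are simplified stand-ins for <linux/list.h>, and struct toy_sp,
account() and unaccount() are hypothetical names that merely mirror the
shape of account_huge_nx_page()/unaccount_huge_nx_page().  It shows why an
INIT_LIST_HEAD()'d node plus list_empty() can replace a separate "is on the
recovery list" flag, and why list_del_init() keeps the unaccount path safe
for pages that were tagged but never listed.

/*
 * Standalone illustration (assumed/simplified, not kernel code): an
 * initialized list node doubles as a membership flag via list_empty(),
 * and list_del_init() makes removal idempotent.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h->prev = h;	/* empty: node points at itself */
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	INIT_LIST_HEAD(n);	/* re-init so list_empty() is true again */
}

/* Toy stand-in for struct kvm_mmu_page. */
struct toy_sp {
	bool lpage_disallowed;
	struct list_head lpage_disallowed_link;
};

static struct list_head recovery_list;
static unsigned long nx_lpage_splits;

/* Mirrors the patch: always tag the page, only list it if recoverable. */
static void account(struct toy_sp *sp, bool nx_huge_page_possible)
{
	assert(list_empty(&sp->lpage_disallowed_link));

	sp->lpage_disallowed = true;
	if (!nx_huge_page_possible)
		return;

	nx_lpage_splits++;
	list_add_tail(&sp->lpage_disallowed_link, &recovery_list);
}

/* Safe to call whether or not the page was ever put on the list. */
static void unaccount(struct toy_sp *sp)
{
	sp->lpage_disallowed = false;

	if (list_empty(&sp->lpage_disallowed_link))
		return;

	nx_lpage_splits--;
	list_del_init(&sp->lpage_disallowed_link);
}

int main(void)
{
	struct toy_sp tagged_only, listed;

	INIT_LIST_HEAD(&recovery_list);
	INIT_LIST_HEAD(&tagged_only.lpage_disallowed_link);
	INIT_LIST_HEAD(&listed.lpage_disallowed_link);
	tagged_only.lpage_disallowed = listed.lpage_disallowed = false;

	account(&tagged_only, false);	/* tagged, but not recoverable */
	account(&listed, true);		/* tagged and queued for recovery */
	printf("splits after account: %lu\n", nx_lpage_splits);   /* 1 */

	unaccount(&tagged_only);	/* not on the list, no underflow */
	unaccount(&listed);
	printf("splits after unaccount: %lu\n", nx_lpage_splits); /* 0 */
	return 0;
}

The pertinent property is that list_del_init(), unlike list_del(), leaves
the node pointing at itself, so list_empty() stays a reliable membership
test across repeated account/unaccount cycles.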