Date: Tue, 12 Jan 2021 10:10:28 -0800
In-Reply-To: <20210112181041.356734-1-bgardon@google.com>
Message-Id: <20210112181041.356734-12-bgardon@google.com>
References: <20210112181041.356734-1-bgardon@google.com>
Subject: [PATCH 11/24] kvm: x86/mmu: Put TDP MMU PT walks in RCU read-critical section
From: Ben Gardon <bgardon@google.com>
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier,
 Peter Feiner, Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li,
 Vitaly Kuznetsov, Xiao Guangrong, Ben Gardon

In order to enable concurrent modifications to the paging structures in
the TDP MMU, threads must be able to safely remove pages of page table
memory while other threads are traversing the same memory. To ensure
threads do not access PT memory after it is freed, protect PT memory
with RCU.

Reviewed-by: Peter Feiner
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 53 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 51 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index e8f35cd46b4c..662907d374b3 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -458,11 +458,14 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm,
  * Return true if this function yielded, the TLBs were flushed, and the
  * iterator's traversal was reset. Return false if a yield was not needed.
  */
-static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *iter)
+static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm,
+					    struct tdp_iter *iter)
 {
 	if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
 		kvm_flush_remote_tlbs(kvm);
+		rcu_read_unlock();
 		cond_resched_lock(&kvm->mmu_lock);
+		rcu_read_lock();
 		tdp_iter_refresh_walk(iter);
 		return true;
 	} else
@@ -483,7 +486,9 @@ static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *it
 static bool tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter)
 {
 	if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
+		rcu_read_unlock();
 		cond_resched_lock(&kvm->mmu_lock);
+		rcu_read_lock();
 		tdp_iter_refresh_walk(iter);
 		return true;
 	} else
@@ -508,6 +513,8 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 	gfn_t last_goal_gfn = start;
 	bool flush_needed = false;
 
+	rcu_read_lock();
+
 	tdp_root_for_each_pte(iter, root, start, end) {
 		/* Ensure forward progress has been made before yielding. */
 		if (can_yield && iter.goal_gfn != last_goal_gfn &&
@@ -538,6 +545,8 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 		tdp_mmu_set_spte(kvm, &iter, 0);
 		flush_needed = true;
 	}
+
+	rcu_read_unlock();
 	return flush_needed;
 }
 
@@ -650,6 +659,9 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 					huge_page_disallowed, &req_level);
 
 	trace_kvm_mmu_spte_requested(gpa, level, pfn);
+
+	rcu_read_lock();
+
 	tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
 		if (nx_huge_page_workaround_enabled)
 			disallowed_hugepage_adjust(iter.old_spte, gfn,
@@ -693,11 +705,14 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 		}
 	}
 
-	if (WARN_ON(iter.level != level))
+	if (WARN_ON(iter.level != level)) {
+		rcu_read_unlock();
 		return RET_PF_RETRY;
+	}
 
 	ret = tdp_mmu_map_handle_target_level(vcpu, write, map_writable, &iter,
 					      pfn, prefault);
+	rcu_read_unlock();
 
 	return ret;
 }
@@ -768,6 +783,8 @@ static int age_gfn_range(struct kvm *kvm, struct kvm_memory_slot *slot,
 	int young = 0;
 	u64 new_spte = 0;
 
+	rcu_read_lock();
+
 	tdp_root_for_each_leaf_pte(iter, root, start, end) {
 		/*
 		 * If we have a non-accessed entry we don't need to change the
@@ -799,6 +816,8 @@ static int age_gfn_range(struct kvm *kvm, struct kvm_memory_slot *slot,
 		trace_kvm_age_page(iter.gfn, iter.level, slot, young);
 	}
 
+	rcu_read_unlock();
+
 	return young;
 }
 
@@ -844,6 +863,8 @@ static int set_tdp_spte(struct kvm *kvm, struct kvm_memory_slot *slot,
 	u64 new_spte;
 	int need_flush = 0;
 
+	rcu_read_lock();
+
 	WARN_ON(pte_huge(*ptep));
 
 	new_pfn = pte_pfn(*ptep);
@@ -872,6 +893,8 @@ static int set_tdp_spte(struct kvm *kvm, struct kvm_memory_slot *slot,
 	if (need_flush)
 		kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
 
+	rcu_read_unlock();
+
 	return 0;
 }
 
@@ -896,6 +919,8 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 	gfn_t last_goal_gfn = start;
 	bool spte_set = false;
 
+	rcu_read_lock();
+
 	BUG_ON(min_level > KVM_MAX_HUGEPAGE_LEVEL);
 
 	for_each_tdp_pte_min_level(iter, root->spt, root->role.level,
@@ -924,6 +949,8 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 		tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte);
 		spte_set = true;
 	}
+
+	rcu_read_unlock();
 	return spte_set;
 }
 
@@ -966,6 +993,8 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 	gfn_t last_goal_gfn = start;
 	bool spte_set = false;
 
+	rcu_read_lock();
+
 	tdp_root_for_each_leaf_pte(iter, root, start, end) {
 		/* Ensure forward progress has been made before yielding. */
 		if (iter.goal_gfn != last_goal_gfn &&
@@ -994,6 +1023,8 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 		tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte);
 		spte_set = true;
 	}
+
+	rcu_read_unlock();
 	return spte_set;
 }
 
@@ -1035,6 +1066,8 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
 	struct tdp_iter iter;
 	u64 new_spte;
 
+	rcu_read_lock();
+
 	tdp_root_for_each_leaf_pte(iter, root, gfn + __ffs(mask),
 				    gfn + BITS_PER_LONG) {
 		if (!mask)
@@ -1060,6 +1093,8 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
 
 		mask &= ~(1UL << (iter.gfn - gfn));
 	}
+
+	rcu_read_unlock();
 }
 
 /*
@@ -1100,6 +1135,8 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 	gfn_t last_goal_gfn = start;
 	bool spte_set = false;
 
+	rcu_read_lock();
+
 	tdp_root_for_each_pte(iter, root, start, end) {
 		/* Ensure forward progress has been made before yielding. */
 		if (iter.goal_gfn != last_goal_gfn &&
@@ -1125,6 +1162,7 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 		spte_set = true;
 	}
 
+	rcu_read_unlock();
 	return spte_set;
 }
 
@@ -1163,6 +1201,8 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
 	gfn_t last_goal_gfn = start;
 	bool spte_set = false;
 
+	rcu_read_lock();
+
 	tdp_root_for_each_pte(iter, root, start, end) {
 		/* Ensure forward progress has been made before yielding. */
 		if (iter.goal_gfn != last_goal_gfn &&
@@ -1190,6 +1230,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
 		spte_set = true;
 	}
 
+	rcu_read_unlock();
 	if (spte_set)
 		kvm_flush_remote_tlbs(kvm);
 }
@@ -1226,6 +1267,8 @@ static bool write_protect_gfn(struct kvm *kvm, struct kvm_mmu_page *root,
 	u64 new_spte;
 	bool spte_set = false;
 
+	rcu_read_lock();
+
 	tdp_root_for_each_leaf_pte(iter, root, gfn, gfn + 1) {
 		if (!is_writable_pte(iter.old_spte))
 			break;
@@ -1237,6 +1280,8 @@ static bool write_protect_gfn(struct kvm *kvm, struct kvm_mmu_page *root,
 		spte_set = true;
 	}
 
+	rcu_read_unlock();
+
 	return spte_set;
 }
 
@@ -1277,10 +1322,14 @@ int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
 
 	*root_level = vcpu->arch.mmu->shadow_root_level;
 
+	rcu_read_lock();
+
 	tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
 		leaf = iter.level;
 		sptes[leaf] = iter.old_spte;
 	}
 
+	rcu_read_unlock();
+
 	return leaf;
 }
-- 
2.30.0.284.gd98b1dd5eaa7-goog
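
For readers who want the RCU pattern from the commit message in isolation, a
minimal sketch follows. It is not part of the patch, and every name in it
(struct pt_page, example_root, example_walk, example_remove) is invented for
illustration. The idea: walkers only touch page-table memory inside an RCU
read-side critical section, and the thread that unlinks a page defers the
actual free with call_rcu() until a grace period has elapsed, so a concurrent
walker can never dereference freed PT memory. It also shows why the hunks
above drop the RCU read lock around cond_resched_lock(): sleeping is not
allowed inside a read-side critical section.

	#include <linux/rcupdate.h>
	#include <linux/slab.h>
	#include <linux/types.h>

	/* One page worth of SPTEs, plus an rcu_head used to defer freeing. */
	struct pt_page {
		u64 sptes[512];
		struct rcu_head rcu;
	};

	static struct pt_page __rcu *example_root;

	/* Reader: walk the page only while holding the RCU read lock. */
	static u64 example_walk(int idx)
	{
		struct pt_page *pt;
		u64 spte = 0;

		rcu_read_lock();
		pt = rcu_dereference(example_root);
		if (pt)
			spte = READ_ONCE(pt->sptes[idx]);
		rcu_read_unlock();

		return spte;
	}

	static void example_free_pt(struct rcu_head *rcu)
	{
		kfree(container_of(rcu, struct pt_page, rcu));
	}

	/* Remover: unlink the page, then free it only after a grace period. */
	static void example_remove(void)
	{
		struct pt_page *pt;

		pt = rcu_dereference_protected(example_root, true);
		RCU_INIT_POINTER(example_root, NULL);
		if (pt)
			call_rcu(&pt->rcu, example_free_pt);
	}

How KVM actually defers the free of TDP MMU page-table pages is outside this
patch; the sketch only mirrors the read-side discipline that the hunks above
introduce around the TDP MMU PT walks.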