Date: Mon, 12 Apr 2021 11:27:04 +0100
Message-ID: <87czuzol1j.wl-maz@kernel.org>
From: Marc Zyngier
To: Paolo Bonzini
Cc: Sean Christopherson, Huacai Chen, Aleksandar Markovic,
	Paul Mackerras, James Morse, Julien Thierry, Suzuki K Poulose,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu,
	linux-mips@vger.kernel.org, kvm@vger.kernel.org,
	kvm-ppc@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gardon
Subject: Re: [PATCH v2 00/10] KVM: Consolidate and optimize MMU notifiers
In-Reply-To: <9376b453-be3a-f8b7-d53a-7e54c25161ce@redhat.com>
References: <20210402005658.3024832-1-seanjc@google.com>
	<9376b453-be3a-f8b7-d53a-7e54c25161ce@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

On Fri, 02 Apr 2021 13:17:45 +0100,
Paolo Bonzini wrote:
>
> On 02/04/21 02:56, Sean Christopherson wrote:
> > The end goal of this series is to optimize the MMU notifiers to take
> > mmu_lock if and only if the notification is relevant to KVM, i.e. the hva
> > range overlaps a memslot. Large VMs (hundreds of vCPUs) are very
> > sensitive to mmu_lock being taken for write at inopportune times, and
> > such VMs also tend to be "static", e.g. backed by HugeTLB with minimal
> > page shenanigans.
> > The vast majority of notifications for these VMs will
> > be spurious (for KVM), and eliding mmu_lock for spurious notifications
> > avoids an otherwise unacceptable disruption to the guest.
> >
> > To get there without potentially degrading performance, e.g. due to
> > multiple memslot lookups, especially on non-x86 where the use cases are
> > largely unknown (from my perspective), first consolidate the MMU notifier
> > logic by moving the hva->gfn lookups into common KVM.
> >
> > Based on kvm/queue, commit 5f986f748438 ("KVM: x86: dump_vmcs should
> > include the autoload/autostore MSR lists").
> >
> > Well tested on Intel and AMD. Compile tested for arm64, MIPS, PPC,
> > PPC e500, and s390. Absolutely needs to be tested for real on non-x86,
> > I give it even odds that I introduced an off-by-one bug somewhere.
> >
> > v2:
> >  - Drop the patches that have already been pushed to kvm/queue.
> >  - Drop two selftest changes that had snuck in via "git commit -a".
> >  - Add a patch to assert that mmu_notifier_count is elevated when
> >    .change_pte() runs. [Paolo]
> >  - Split out moving KVM_MMU_(UN)LOCK() to __kvm_handle_hva_range() to a
> >    separate patch. Opted not to squash it with the introduction of the
> >    common hva walkers (patch 02), as that prevented sharing code between
> >    the old and new APIs. [Paolo]
> >  - Tweak the comment in kvm_vm_destroy() above the smashing of the new
> >    slots lock. [Paolo]
> >  - Make mmu_notifier_slots_lock unconditional to avoid #ifdefs. [Paolo]
> >
> > v1:
> >  - https://lkml.kernel.org/r/20210326021957.1424875-1-seanjc@google.com
> >
> > Sean Christopherson (10):
> >   KVM: Assert that notifier count is elevated in .change_pte()
> >   KVM: Move x86's MMU notifier memslot walkers to generic code
> >   KVM: arm64: Convert to the gfn-based MMU notifier callbacks
> >   KVM: MIPS/MMU: Convert to the gfn-based MMU notifier callbacks
> >   KVM: PPC: Convert to the gfn-based MMU notifier callbacks
> >   KVM: Kill off the old hva-based MMU notifier callbacks
> >   KVM: Move MMU notifier's mmu_lock acquisition into common helper
> >   KVM: Take mmu_lock when handling MMU notifier iff the hva hits a
> >     memslot
> >   KVM: Don't take mmu_lock for range invalidation unless necessary
> >   KVM: x86/mmu: Allow yielding during MMU notifier unmap/zap, if
> >     possible
> >
> >  arch/arm64/kvm/mmu.c                   | 117 +++------
> >  arch/mips/kvm/mmu.c                    |  97 ++------
> >  arch/powerpc/include/asm/kvm_book3s.h  |  12 +-
> >  arch/powerpc/include/asm/kvm_ppc.h     |   9 +-
> >  arch/powerpc/kvm/book3s.c              |  18 +-
> >  arch/powerpc/kvm/book3s.h              |  10 +-
> >  arch/powerpc/kvm/book3s_64_mmu_hv.c    |  98 ++------
> >  arch/powerpc/kvm/book3s_64_mmu_radix.c |  25 +-
> >  arch/powerpc/kvm/book3s_hv.c           |  12 +-
> >  arch/powerpc/kvm/book3s_pr.c           |  56 ++---
> >  arch/powerpc/kvm/e500_mmu_host.c       |  27 +-
> >  arch/x86/kvm/mmu/mmu.c                 | 127 ++++------
> >  arch/x86/kvm/mmu/tdp_mmu.c             | 245 +++++++------------
> >  arch/x86/kvm/mmu/tdp_mmu.h             |  14 +-
> >  include/linux/kvm_host.h               |  22 +-
> >  virt/kvm/kvm_main.c                    | 325 +++++++++++++++++++------
> >  16 files changed, 552 insertions(+), 662 deletions(-)
> >
>
> For MIPS, I am going to post a series that simplifies TLB flushing
> further. I applied it, and rebased this one on top, to
> kvm/mmu-notifier-queue.
>
> Architecture maintainers, please look at the branch and
> review/test/ack your parts.

I've given this a reasonably good beating on arm64 for both VHE and
nVHE HW, and nothing caught fire, although I was left with a conflict
in the x86 code after merging with linux/master.

Feel free to add a

Tested-by: Marc Zyngier

for the arm64 side.

	M.

-- 
Without deviation from the norm, progress is not possible.