From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH 00/18] KVM: Consolidate and optimize MMU notifiers
To: Sean Christopherson, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
 Paul Mackerras
Cc: James Morse, Julien Thierry, Suzuki K Poulose, Vitaly Kuznetsov,
 Wanpeng Li, Jim Mattson, Joerg Roedel,
 linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu,
 linux-mips@vger.kernel.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org,
 linux-kernel@vger.kernel.org, Ben Gardon
References: <20210326021957.1424875-1-seanjc@google.com>
From: Paolo Bonzini
Date: Wed, 31 Mar 2021 09:57:17 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.0
MIME-Version: 1.0
In-Reply-To: <20210326021957.1424875-1-seanjc@google.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Mailing-List: linux-kernel@vger.kernel.org

On 26/03/21 03:19, Sean Christopherson wrote:
> The end goal of this series is to optimize the MMU notifiers to take
> mmu_lock if and only if the notification is relevant to KVM, i.e. the hva
> range overlaps a memslot. Large VMs (hundreds of vCPUs) are very
> sensitive to mmu_lock being taken for write at inopportune times, and
> such VMs also tend to be "static", e.g. backed by HugeTLB with minimal
> page shenanigans. The vast majority of notifications for these VMs will
> be spurious (for KVM), and eliding mmu_lock for spurious notifications
> avoids an otherwise unacceptable disruption to the guest.
>
> To get there without potentially degrading performance, e.g. due to
> multiple memslot lookups, especially on non-x86 where the use cases are
> largely unknown (from my perspective), first consolidate the MMU notifier
> logic by moving the hva->gfn lookups into common KVM.
>
> Applies on my TDP MMU TLB flushing bug fixes[*], which conflict horribly
> with the TDP MMU changes in this series. That code applies on kvm/queue
> (commit 4a98623d5d90, "KVM: x86/mmu: Mark the PAE roots as decrypted for
> shadow paging").
>
> Speaking of conflicts, Ben will soon be posting a series to convert a
> bunch of TDP MMU flows to take mmu_lock only for read. Presumably there
> will be an absurd number of conflicts; Ben and I will sort out the
> conflicts in whichever series loses the race.
>
> Well tested on Intel and AMD. Compile tested for arm64, MIPS, PPC,
> PPC e500, and s390. Absolutely needs to be tested for real on non-x86,
> I give it even odds that I introduced an off-by-one bug somewhere.
>
> [*] https://lkml.kernel.org/r/20210325200119.1359384-1-seanjc@google.com
>
>
> Patches 1-7 are x86 specific prep patches to play nice with moving
> the hva->gfn memslot lookups into common code. There ended up being waaay
> more of these than I expected/wanted, but I had a hell of a time getting
> the flushing logic right when shuffling the memslot and address space
> loops. In the end, I was more confident I got things correct by batching
> the flushes.
>
> Patch 8 moves the existing API prototypes into common code. It could
> technically be dropped since the old APIs are gone in the end, but I
> thought the switch to the new APIs would suck a bit less this way.
>
> Patch 9 moves arm64's MMU notifier tracepoints into common code so that
> they are not lost when arm64 is converted to the new APIs, and so that all
> architectures can benefit.
>
> Patch 10 moves x86's memslot walkers into common KVM. I chose x86 purely
> because I could actually test it. All architectures use nearly identical
> code, so I don't think it actually matters in the end.
>
> Patches 11-13 move arm64, MIPS, and PPC to the new APIs.
>
> Patch 14 yanks out the old APIs.
>
> Patch 15 adds the mmu_lock elision, but only for unpaired notifications.
>
> Patch 16 adds mmu_lock elision for paired .invalidate_range_{start,end}().
> This is quite nasty and no small part of me thinks the patch should be
> burned with fire (I won't spoil it any further), but it's also the most
> problematic scenario for our particular use case. :-/
>
> Patches 17-18 are additional x86 cleanups.

Queued 1-9 and 18, thanks. There's a small issue in patch 10 that
prevented me from committing 10-15, but they mostly look good.

Paolo
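
For readers following the series, the "relevant to KVM" test in the quoted
cover letter boils down to a range-overlap check done before mmu_lock is
taken. The sketch below is a user-space illustration only, not code from the
series: struct memslot, its hva_start/hva_end fields, and both function names
are hypothetical, and ranges are assumed to be half-open [start, end), which
is exactly where an off-by-one would hide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for a KVM memslot: a half-open
 * host-virtual-address range [hva_start, hva_end).
 */
struct memslot {
	uint64_t hva_start;
	uint64_t hva_end;
};

/* True if the half-open range [start, end) intersects the slot. */
static bool range_overlaps_slot(const struct memslot *slot,
				uint64_t start, uint64_t end)
{
	return start < slot->hva_end && slot->hva_start < end;
}

/*
 * True if the notification range [start, end) hits at least one slot.
 * In this sketch, a notifier would take the lock only when this
 * returns true and would treat everything else as spurious.
 */
static bool notification_is_relevant(const struct memslot *slots,
				     size_t nslots,
				     uint64_t start, uint64_t end)
{
	assert(start < end);	/* empty or inverted ranges are a caller bug */

	for (size_t i = 0; i < nslots; i++) {
		if (range_overlaps_slot(&slots[i], start, end))
			return true;
	}
	return false;
}
```

A notifier built this way would take mmu_lock only when
notification_is_relevant() returns true. A range that merely touches a slot
boundary, e.g. one ending exactly at hva_start, counts as spurious and would
skip the lock.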