From: Vitaly Kuznetsov
To: Sean Christopherson
Cc: kvm@vger.kernel.org, Paolo Bonzini, Wanpeng Li, Jim Mattson,
    Michael Kelley, Siddharth Chandrasekaran, Yuan Yao, Maxim Levitsky,
    linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v10 02/39] KVM: x86: hyper-v: Resurrect dedicated KVM_REQ_HV_TLB_FLUSH flag
References: <20220921152436.3673454-1-vkuznets@redhat.com>
    <20220921152436.3673454-3-vkuznets@redhat.com>
    <877d1voiuz.fsf@redhat.com>
Date: Thu, 22 Sep 2022 17:37:05 +0200
Message-ID: <87sfkjmndq.fsf@redhat.com>
X-Mailing-List: linux-hyperv@vger.kernel.org

Sean Christopherson writes:

> On Thu, Sep 22, 2022, Vitaly Kuznetsov wrote:
>> Now let's get to VMX and the point of my confusion (and thanks in
>> advance for educating me!):
>> AFAIU, when EPT is in use:
>>  KVM_REQ_TLB_FLUSH_CURRENT == invept
>>  KVM_REQ_TLB_FLUSH_GUEST = invvpid
>>
>> For "normal" mappings (which are mapped on both stages) this is the same
>> thing as they're 'tagged' with both VPID and 'EPT root'. The question is
>> what's left. Given your comment, do I understand correctly that in case
>> of an invalid mapping in the guest (GVA doesn't resolve to a GPA), this
>> will only be tagged with VPID but not with 'EPT root' (as the CPU never
>> reached to the second translation stage)? We certainly can't ignore
>> these. Another (probably pure theoretical question) is what are the
>> mappings which are tagged with 'EPT root' but don't have a VPID tag?
>
> Intel puts mappings into three categories, which for non-root mode equates to:
>
>   linear         == GVA => GPA
>   guest-physical == GPA => HPA
>   combined       == GVA => HPA
>
> and essentially the categories that consume the GVA are tagged with the VPID
> (linear and combined), and categories that consume the GPA are tagged with the
> EPTP address (guest-physical and combined).
>
>> Are these the mapping which happen when e.g. vCPU has paging disabled?
>
> No, these mappings can be created at all times. Even with CR0.PG=1, the guest
> can generate GPAs without going through a GVA=>GPA translation, e.g. the page tables
> themselves, RTIT (Intel PT) addresses, etc... And even for combined/full
> translations, the CPU can insert TLB entries for just the GPA=>HPA part.
>
> E.g. when a page is allocated by/for userspace, the kernel will zero the page using
> the kernel's direct map, but userspace will access the page via a different GVA.
> I.e. the guest effectively aliases GPA(x) with GVA(k) and GVA(u). By inserting
> the GPA(x) => HPA(y) into the TLB, when guest userspace accesses GVA(u), the CPU
> encounters a TLB miss on GVA(u) => GPA(x), but gets a TLB hit on GPA(x) => HPA(y).
>
> Separating EPT flushes from VPID (and PCID) flushes allows the CPU to retain
> the partial TLB entries, e.g. a host change in the EPT tables will result in the
> guest-physical and combined mappings being invalidated, but linear mappings can
> be kept.
>

Thanks a bunch! For some reason I thought it's always the full thing
(combined) which is tagged with both VPID/PCID and EPTP and
linear/guest-physical are just 'corner' cases (but are still combined
and tagged). Apparently, it's not like that.

> I'm 99% certain AMD also caches partial entries, e.g. see the blurb on INVLPGA
> not affecting NPT translations, AMD just doesn't provide a way for the host to
> flush _only_ NPT translations. Maybe the performance benefits weren't significant
> enough to justify the extra complexity?
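
For illustration only, the tagging rules quoted above can be written down as a small C model. This is a sketch under stated assumptions, not KVM code or a hardware interface: every name in it (mapping_kind, flush_kind, tagged_with_vpid(), tagged_with_eptp(), flush_invalidates(), model_self_check()) is hypothetical, and it models only which category a flush reaches, not address ranges or PCIDs.

```c
#include <assert.h>
#include <stdbool.h>

/* The three categories from the thread (Intel terminology, non-root mode). */
enum mapping_kind {
	MAP_LINEAR,		/* GVA => GPA */
	MAP_GUEST_PHYSICAL,	/* GPA => HPA */
	MAP_COMBINED,		/* GVA => HPA */
};

/* Hypothetical flush types: one keyed on the EPTP tag, one on the VPID tag. */
enum flush_kind {
	FLUSH_BY_EPTP,		/* models an INVEPT-style flush */
	FLUSH_BY_VPID,		/* models an INVVPID-style flush */
};

/* Categories that consume the GVA carry the VPID tag. */
static bool tagged_with_vpid(enum mapping_kind k)
{
	return k == MAP_LINEAR || k == MAP_COMBINED;
}

/* Categories that consume the GPA carry the EPTP tag. */
static bool tagged_with_eptp(enum mapping_kind k)
{
	return k == MAP_GUEST_PHYSICAL || k == MAP_COMBINED;
}

/* A flush reaches exactly the categories that carry the tag it is keyed on. */
static bool flush_invalidates(enum flush_kind f, enum mapping_kind k)
{
	return f == FLUSH_BY_EPTP ? tagged_with_eptp(k) : tagged_with_vpid(k);
}

/* A host EPT change drops guest-physical and combined, but keeps linear. */
static void model_self_check(void)
{
	assert(flush_invalidates(FLUSH_BY_EPTP, MAP_GUEST_PHYSICAL));
	assert(flush_invalidates(FLUSH_BY_EPTP, MAP_COMBINED));
	assert(!flush_invalidates(FLUSH_BY_EPTP, MAP_LINEAR));
}
```

In this model the combined category is the only one reached by both flush types, which is why the two requests look identical for "normal" mappings and differ only for the partial (linear-only or guest-physical-only) entries.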
>
>> These are probably unrelated to Hyper-V TLB flushing.
>>
>> To preserve the 'small' optimization, we can probably move
>>  kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
>>
>> to nested_svm_transition_tlb_flush() or, in case this sounds too
>> hackish
>
> Move it to svm_flush_tlb_current(), because the justification is that on SVM,
> flushing "current" TLB entries also flushes "guest" TLB entries due to the more
> coarse-grained ASID-based TLB flush. E.g.
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index dd599afc85f5..a86b41503723 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3737,6 +3737,13 @@ static void svm_flush_tlb_current(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_svm *svm = to_svm(vcpu);
>  
> +	/*
> +	 * Unlike VMX, SVM doesn't provide a way to flush only NPT TLB entries.
> +	 * A TLB flush for the current ASID flushes both "host" and "guest" TLB
> +	 * entries, and thus is a superset of Hyper-V's fine grained flushing.
> +	 */
> +	kvm_hv_vcpu_purge_flush_tlb(vcpu);
> +
>  	/*
>  	 * Flush only the current ASID even if the TLB flush was invoked via
>  	 * kvm_flush_remote_tlbs(). Although flushing remote TLBs requires all
>
>> we can drop it for now and add it to the (already overfull)
>> bucket of the "optimize nested_svm_transition_tlb_flush()".
>
> I think even long term, purging Hyper-V's FIFO in svm_flush_tlb_current() is the
> correct/desired behavior. This doesn't really have anything to do with nSVM,
> it's all about SVM not providing a way to flush only NPT entries.

True that, silly me forgot that even without any nesting, Hyper-V TLB
flush after svm_flush_tlb_current() makes no sense.

>

-- 
Vitaly
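
As a rough illustration of the "superset" argument above, the following C sketch models a per-vCPU FIFO of pending fine-grained flush requests that becomes redundant once a full flush of the current ASID happens. It is a toy model under stated assumptions, not the KVM implementation: the types and functions (toy_flush_fifo, toy_vcpu, toy_hv_queue_flush(), toy_hv_purge_flush(), toy_svm_flush_current()) and the FIFO size are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FIFO_SIZE 16	/* arbitrary size, chosen for the sketch */

/* Pending fine-grained (per-GVA) flush requests for one vCPU. */
struct toy_flush_fifo {
	uint64_t entries[TOY_FIFO_SIZE];
	size_t count;
};

struct toy_vcpu {
	struct toy_flush_fifo hv_fifo;
	bool hv_flush_requested;	/* stands in for a pending request flag */
	unsigned int full_flushes;	/* counts full ASID flushes performed */
};

/* Queue one fine-grained flush; returns false if the FIFO is full. */
static bool toy_hv_queue_flush(struct toy_vcpu *v, uint64_t gva)
{
	if (v->hv_fifo.count == TOY_FIFO_SIZE)
		return false;
	v->hv_fifo.entries[v->hv_fifo.count++] = gva;
	v->hv_flush_requested = true;
	return true;
}

/* Drop all queued fine-grained requests and clear the pending flag. */
static void toy_hv_purge_flush(struct toy_vcpu *v)
{
	v->hv_fifo.count = 0;
	v->hv_flush_requested = false;
}

/*
 * Model of a "flush current" on a platform where the only available flush
 * covers everything for the ASID: the queued fine-grained requests are
 * already satisfied by it, so they are purged before the full flush.
 */
static void toy_svm_flush_current(struct toy_vcpu *v)
{
	toy_hv_purge_flush(v);
	v->full_flushes++;
}
```

The point of the ordering is that nothing here depends on nesting: any full flush of the current ASID makes the queued entries redundant, which matches the conclusion reached in the thread.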