From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <14c4784b5e2bfbe813964485b46052315b533744.camel@redhat.com>
Subject: Re: [PATCH 3/6] KVM: SVM: fix AVIC race of host->guest IPI delivery vs AVIC inhibition
From: Maxim Levitsky
To: Sean Christopherson
Cc: Paolo Bonzini, kvm@vger.kernel.org,
    "open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)", Wanpeng Li,
    Dave Hansen, Joerg Roedel, "H. Peter Anvin", Vitaly Kuznetsov,
    Borislav Petkov, "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)",
    Ingo Molnar, Thomas Gleixner, Jim Mattson
Date: Thu, 09 Dec 2021 17:33:33 +0200
In-Reply-To:
References: <20211209115440.394441-1-mlevitsk@redhat.com>
    <20211209115440.394441-4-mlevitsk@redhat.com>
    <4d723b07-e626-190d-63f4-fd0b5497dd9b@redhat.com>
Content-Type: text/plain; charset="UTF-8"
User-Agent: Evolution 3.36.5 (3.36.5-2.fc32)
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2021-12-09 at 15:27 +0000, Sean Christopherson wrote:
> On Thu, Dec 09, 2021, Maxim Levitsky wrote:
> > On Thu, 2021-12-09 at 15:11 +0100, Paolo Bonzini wrote:
> > > On 12/9/21 12:54, Maxim Levitsky wrote:
> > > > If svm_deliver_avic_intr is called just after the target vcpu's AVIC got
> > > > inhibited, it might read a stale value of vcpu->arch.apicv_active
> > > > which can lead to the target vCPU not noticing the interrupt.
> > > > 
> > > > Signed-off-by: Maxim Levitsky
> > > > ---
> > > >  arch/x86/kvm/svm/avic.c | 16 +++++++++++++---
> > > >  1 file changed, 13 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> > > > index 859ad2dc50f1..8c1b934bfa9b 100644
> > > > --- a/arch/x86/kvm/svm/avic.c
> > > > +++ b/arch/x86/kvm/svm/avic.c
> > > > @@ -691,6 +691,15 @@ int svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
> > > >  	 * automatically process AVIC interrupts at VMRUN.
> > > >  	 */
> > > >  	if (vcpu->mode == IN_GUEST_MODE) {
> > > > +
> > > > +		/*
> > > > +		 * At this point we have read vcpu->arch.apicv_active == true
> > > > +		 * and vcpu->mode == IN_GUEST_MODE.
> > > > +		 * Since there is a memory barrier after setting IN_GUEST_MODE,
> > > > +		 * it ensures that AVIC inhibition is complete and thus
> > > > +		 * the target is really running with AVIC enabled.
> > > > +		 */
> > > > +
> > > >  		int cpu = READ_ONCE(vcpu->cpu);
> > > 
> > > I don't think it's correct.  The vCPU has apicv_active written (in
> > > kvm_vcpu_update_apicv) before vcpu->mode.
> > 
> > I thought that we have a full memory barrier just prior to setting IN_GUEST_MODE,
> > thus if I see vcpu->mode == IN_GUEST_MODE, then I'll see the correct apicv_active
> > value. But apparently the memory barrier is after setting vcpu->mode.
> > 
> > > For the acquire/release pair to work properly you need to 1) read
> > > apicv_active *after* vcpu->mode here 2) use store_release and
> > > load_acquire for vcpu->mode, respectively in vcpu_enter_guest and here.
> > 
> > store_release for vcpu->mode in vcpu_enter_guest means a write barrier just
> > before setting it, which I expected to be there.
> > 
> > And yes, I see now that I need a read barrier here as well. I am still learning this.
> 
> Sans barriers and comments, can't this be written as returning an "error" if the
> vCPU is not IN_GUEST_MODE?  Effectively the same thing, but a little more precise
> and it avoids duplicating the lapic.c code.

Yes, besides the fact that we have already set the vIRR bit, so if I return -1 here,
it will be set again (and these bits are set using atomic ops).

I don't know how much that matters, except that while a vCPU runs a nested guest,
callers wishing to send an IPI to it will go through this code path a lot (even when
I implement nested AVIC, as that is a separate thing which is used by L2 only).

Best regards,
	Maxim Levitsky

> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 26ed5325c593..cddf7a8da3ea 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -671,7 +671,7 @@ void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
> 
>  int svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
>  {
> -	if (!vcpu->arch.apicv_active)
> +	if (vcpu->mode != IN_GUEST_MODE || !vcpu->arch.apicv_active)
>  		return -1;
> 
>  	kvm_lapic_set_irr(vec, vcpu->arch.apic);
> @@ -706,8 +706,9 @@ int svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
>  		put_cpu();
>  	} else {
>  		/*
> -		 * Wake the vCPU if it was blocking. KVM will then detect the
> -		 * pending IRQ when checking if the vCPU has a wake event.
> +		 * Wake the vCPU if it is blocking. If the vCPU exited the
> +		 * guest since the previous vcpu->mode check, it's guaranteed
> +		 * to see the event before re-entering the guest.
>  		 */
>  		kvm_vcpu_wake_up(vcpu);
>  	}
> 
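
For reference, a minimal sketch (not the actual kernel code, and not part of either
patch above) of the acquire/release pairing Paolo describes. The writer side stands in
for kvm_vcpu_update_apicv() followed by the vcpu->mode update in vcpu_enter_guest(),
the reader side for the early check in svm_deliver_avic_intr(); the "activated" local
is illustrative only:

	/*
	 * Writer side (conceptually vcpu_enter_guest()): apicv_active is
	 * written first, then vcpu->mode is published with release
	 * semantics, so a reader that observes IN_GUEST_MODE is also
	 * guaranteed to observe the apicv_active value written before it.
	 */
	vcpu->arch.apicv_active = activated;	/* done in kvm_vcpu_update_apicv() */
	smp_store_release(&vcpu->mode, IN_GUEST_MODE);

	/*
	 * Reader side (conceptually svm_deliver_avic_intr()): load
	 * vcpu->mode with acquire semantics *before* reading apicv_active,
	 * and fall back to the non-AVIC delivery path if either check fails.
	 */
	if (smp_load_acquire(&vcpu->mode) != IN_GUEST_MODE ||
	    !vcpu->arch.apicv_active)
		return -1;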