From:   Vitaly Kuznetsov <vkuznets@redhat.com>
To:     Sean Christopherson <seanjc@google.com>
Cc:     kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
        Wanpeng Li <wanpengli@tencent.com>,
        Jim Mattson <jmattson@google.com>,
        Michael Kelley <mikelley@microsoft.com>,
        Siddharth Chandrasekaran <sidcha@amazon.de>,
        Yuan Yao <yuan.yao@linux.intel.com>,
        Maxim Levitsky <mlevitsk@redhat.com>,
        linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v10 03/39] KVM: x86: hyper-v: Introduce TLB flush fifo
In-Reply-To: <YytCKIMgiVY+kSf9@google.com>
References: <20220921152436.3673454-1-vkuznets@redhat.com>
 <20220921152436.3673454-4-vkuznets@redhat.com>
 <YytCKIMgiVY+kSf9@google.com>
Date:   Thu, 22 Sep 2022 11:42:36 +0200
Message-ID: <871qs3oicz.fsf@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain
Precedence: bulk
List-ID: <linux-hyperv.vger.kernel.org>
X-Mailing-List: linux-hyperv@vger.kernel.org

Sean Christopherson <seanjc@google.com> writes:

> On Wed, Sep 21, 2022, Vitaly Kuznetsov wrote:
>> To allow flushing individual GVAs instead of always flushing the whole
>> VPID a per-vCPU structure to pass the requests is needed. Use standard
>> 'kfifo' to queue two types of entries: individual GVA (GFN + up to 4095
>> following GFNs in the lower 12 bits) and 'flush all'.
>> 
>> The size of the fifo is arbitrary set to '16'.
>
> s/arbitrary/arbitrarily
>
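
For readers following along, here is a rough userspace sketch of the entry
layout described in the commit message above. All SKETCH_* names and the
helpers are made up for illustration. The all-ones sentinel and the
"GFN shifted left by 12" packing are assumptions, not copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; not taken from the patch. */
#define SKETCH_PAGE_SHIFT     12
#define SKETCH_COUNT_MASK     ((UINT64_C(1) << SKETCH_PAGE_SHIFT) - 1) /* lower 12 bits */
#define SKETCH_FLUSHALL_ENTRY UINT64_MAX /* hypothetical 'flush all' sentinel */

/* Pack a starting GFN plus the number of following GFNs (0..4095). */
static uint64_t sketch_make_entry(uint64_t gfn, unsigned int extra_gfns)
{
	assert(extra_gfns <= SKETCH_COUNT_MASK);
	return (gfn << SKETCH_PAGE_SHIFT) | extra_gfns;
}

/* The sentinel value is reserved: it never describes a GFN range here. */
static int sketch_is_flush_all(uint64_t entry)
{
	return entry == SKETCH_FLUSHALL_ENTRY;
}

/* First GFN covered by the entry. */
static uint64_t sketch_entry_gfn(uint64_t entry)
{
	return entry >> SKETCH_PAGE_SHIFT;
}

/* Total pages covered: the first GFN plus the count in the low 12 bits. */
static unsigned int sketch_entry_pages(uint64_t entry)
{
	return (unsigned int)(entry & SKETCH_COUNT_MASK) + 1;
}
```

With this layout a single 64-bit fifo slot can describe up to 4096
consecutive pages, which is why one u64 per entry is enough for both kinds
of request.
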
>> +static void hv_tlb_flush_enqueue(struct kvm_vcpu *vcpu)
>> +{
>> +	struct kvm_vcpu_hv_tlb_flush_fifo *tlb_flush_fifo;
>> +	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>> +	u64 flush_all_entry = KVM_HV_TLB_FLUSHALL_ENTRY;
>> +
>> +	if (!hv_vcpu)
>> +		return;
>> +
>> +	tlb_flush_fifo = &hv_vcpu->tlb_flush_fifo;
>> +
>> +	kfifo_in_spinlocked(&tlb_flush_fifo->entries, &flush_all_entry,
>> +			    1, &tlb_flush_fifo->write_lock);
>
> Unless I'm missing something, there's no need to disable IRQs, i.e. this can be
> kfifo_in_spinlocked_noirqsave() and the later patch can use spin_lock() instead
> of spin_lock_irqsave().  The only calls to hv_tlb_flush_enqueue() are from
> kvm_hv_hypercall(), i.e. it's always called from process context.
>

Yes, no IRQ (or similar) contexts are expected here. The intention was to
hold the spinlock for the shortest possible time, not to protect against a
deadlock. It's probably not worth it and only causes confusion, so I'll
remove it.

>> diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
>> index 1030b1b50552..ac30091ab346 100644
>> --- a/arch/x86/kvm/hyperv.h
>> +++ b/arch/x86/kvm/hyperv.h
>> @@ -151,4 +151,20 @@ int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args);
>>  int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>>  		     struct kvm_cpuid_entry2 __user *entries);
>>  
>> +
>
> Unnecessary newline.
>
>> +static inline void kvm_hv_vcpu_empty_flush_tlb(struct kvm_vcpu *vcpu)
>
> What about "reset" or "purge" instead of "empty"?  "empty" is often used as a query,
> e.g. list_empty(), it took me a second to realize this is a command.
>

'purge' sounds good to me!

>> +{
>> +	struct kvm_vcpu_hv_tlb_flush_fifo *tlb_flush_fifo;
>> +	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>> +
>> +	if (!hv_vcpu || !kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu))
>> +		return;
>> +
>> +	tlb_flush_fifo = &hv_vcpu->tlb_flush_fifo;
>> +
>> +	kfifo_reset_out(&tlb_flush_fifo->entries);
>> +}
>
> Missing newline.
>
>> +void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu);
>> +
>> +
>
> One too many newlines.
>
>>  #endif
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 86504a8bfd9a..45c35c5467f8 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -3385,7 +3385,7 @@ static void kvm_vcpu_flush_tlb_all(struct kvm_vcpu *vcpu)
>>  	static_call(kvm_x86_flush_tlb_all)(vcpu);
>>  }
>>  
>> -static void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
>> +void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
>>  {
>>  	++vcpu->stat.tlb_flush;
>>  
>> @@ -3420,14 +3420,14 @@ void kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu)
>>  {
>>  	if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu)) {
>>  		kvm_vcpu_flush_tlb_current(vcpu);
>> -		kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
>> +		kvm_hv_vcpu_empty_flush_tlb(vcpu);
>
> It might be worth adding a comment to call out that emptying the FIFO _after_ the
> TLB flush is ok, because it's impossible for the CPU to insert TLB entries for the
> guest while running in the host.  At first glance, it looks like this (and the
> existing similar pattern in vcpu_enter_guest()) has a race where it could miss a
> TLB flush.
>
> Definitely not required, e.g. kvm_vcpu_flush_tlb_all() doesn't have a similar
> comment.  I think it's just the existence of the FIFO that made me pause.
>

Np, will add a comment for future generations of readers :)

>>  	}
>>  
>>  	if (kvm_check_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu)) {
>>  		kvm_vcpu_flush_tlb_guest(vcpu);
>> -		kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
>> +		kvm_hv_vcpu_empty_flush_tlb(vcpu);
>>  	} else if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu)) {
>> -		kvm_vcpu_flush_tlb_guest(vcpu);
>> +		kvm_hv_vcpu_flush_tlb(vcpu);
>
> Rather than expose kvm_vcpu_flush_tlb_guest() to Hyper-V, what about implementing
> this in a similar way to how KVM-on-HyperV implements remote TLB flushes?  I.e.
> fall back to kvm_vcpu_flush_tlb_guest() if the precise flush "fails".
>
> I don't mind exposing kvm_vcpu_flush_tlb_guest(), but burying the calls inside
> Hyper-V code makes it difficult to see the relationship between KVM_REQ_HV_TLB_FLUSH
> and KVM_REQ_TLB_FLUSH_GUEST.
>
> And as a minor bonus, that also helps document that kvm_hv_vcpu_flush_tlb() doesn't
> yet support precise flushing.
>
> E.g.
>
> 	if (kvm_check_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu)) {
> 		kvm_vcpu_flush_tlb_guest(vcpu);
> 	} else if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu)) {
> 		/*
> 		 * Fall back to a "full" guest flush if Hyper-V's precise
> 		 * flushing fails.
> 		 */
> 		if (kvm_hv_vcpu_flush_tlb(vcpu))
> 			kvm_vcpu_flush_tlb_guest(vcpu);
> 	}
>
>
> int kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
> {
> 	struct kvm_vcpu_hv_tlb_flush_fifo *tlb_flush_fifo;
> 	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>
> 	if (!hv_vcpu)
> 		return -EINVAL;
>
> 	tlb_flush_fifo = &hv_vcpu->tlb_flush_fifo;
>
> 	kfifo_reset_out(&tlb_flush_fifo->entries);
>
> 	/* Precise flushing isn't implemented yet. */
> 	return -EOPNOTSUPP;
> }
>

Oh, I see. It can certainly be done this way, even if just to improve
readability. Will change.
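
To make the control flow concrete, below is a minimal userspace model of the
"fall back to a full flush" pattern suggested above. It is only a sketch:
the sketch_* types and helpers are hypothetical stand-ins for the vCPU
state, kfifo and the real flush functions, and the fifo size of 16 is the
value from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FIFO_SIZE 16 /* the '16' from the commit message */

/* Hypothetical stand-ins; these are not the kernel structures. */
struct sketch_fifo {
	uint64_t entries[SKETCH_FIFO_SIZE];
	unsigned int in;  /* total entries ever queued */
	unsigned int out; /* total entries ever consumed */
};

struct sketch_vcpu {
	struct sketch_fifo *hv_fifo;     /* NULL if Hyper-V context is absent */
	unsigned int full_guest_flushes; /* counts fallback flushes */
};

static unsigned int sketch_fifo_len(const struct sketch_fifo *fifo)
{
	return fifo->in - fifo->out;
}

/* Queue one entry; returns 1 on success, 0 if the fifo is full. */
static int sketch_fifo_put(struct sketch_fifo *fifo, uint64_t entry)
{
	if (sketch_fifo_len(fifo) == SKETCH_FIFO_SIZE)
		return 0;
	fifo->entries[fifo->in++ % SKETCH_FIFO_SIZE] = entry;
	return 1;
}

/* Drop everything queued so far, in the spirit of kfifo_reset_out(). */
static void sketch_fifo_reset_out(struct sketch_fifo *fifo)
{
	fifo->out = fifo->in;
}

/* Non-zero return means "precise flush not done, caller must flush all". */
static int sketch_hv_flush_tlb(struct sketch_vcpu *vcpu)
{
	if (!vcpu->hv_fifo)
		return -EINVAL;

	sketch_fifo_reset_out(vcpu->hv_fifo);

	/* Precise flushing is not modelled in this sketch. */
	return -EOPNOTSUPP;
}

static void sketch_full_guest_flush(struct sketch_vcpu *vcpu)
{
	vcpu->full_guest_flushes++;
}

/* Caller side: the fallback stays visible next to the request handling. */
static void sketch_service_hv_flush(struct sketch_vcpu *vcpu)
{
	if (sketch_hv_flush_tlb(vcpu))
		sketch_full_guest_flush(vcpu);
}
```

The point of the shape is that the full guest flush stays in the caller, so
the relationship between the two requests is visible in one place, and the
Hyper-V side only reports whether it handled the flush itself.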

-- 
Vitaly