Date: Wed, 17 Aug 2022 16:41:01 +0000
From: Sean Christopherson
To: Paolo Bonzini
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, mlevitsk@redhat.com,
	vkuznets@redhat.com
Subject: Re: [PATCH v2 2/9] KVM: x86: remove return value of kvm_vcpu_block
References: <20220811210605.402337-1-pbonzini@redhat.com>
 <20220811210605.402337-3-pbonzini@redhat.com>
 <78616cf8-2693-72cc-c2cc-5a849116ffc7@redhat.com>
In-Reply-To: <78616cf8-2693-72cc-c2cc-5a849116ffc7@redhat.com>

On Wed, Aug 17, 2022, Paolo Bonzini wrote:
> On 8/17/22 01:34, Sean Christopherson wrote:
> > Isn't freeing up the return from kvm_vcpu_check_block() unnecessary?  Can't we
> > just do:
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 9f11b505cbee..ccb9f8bdeb18 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -10633,7 +10633,7 @@ static inline int vcpu_block(struct kvm_vcpu *vcpu)
> >  			if (hv_timer)
> >  				kvm_lapic_switch_to_hv_timer(vcpu);
> >
> > -			if (!kvm_check_request(KVM_REQ_UNHALT, vcpu))
> > +			if (!kvm_arch_vcpu_runnable(vcpu))
> >  				return 1;
> >  		}
> >
> > which IMO is more intuitive and doesn't require reworking halt-polling (again).
>
> This was my first idea indeed.  However I didn't like calling
> kvm_arch_vcpu_runnable() again and "did it schedule()" seemed to be a less
> interesting result from kvm_vcpu_block() (and in fact kvm_vcpu_halt() does
> not bother passing it up the return chain).

The flip side of calling kvm_arch_vcpu_runnable() again is that KVM will
immediately wake the vCPU if it becomes runnable after kvm_vcpu_check_block().
The edge cases where the vCPU becomes runnable late are unlikely to truly
matter in practice, but on the other hand manually re-checking
kvm_arch_vcpu_runnable() means KVM gets both cases "right" (waited=true iff
the vCPU actually waited, vCPU awakened ASAP), whereas squishing the
information into the return of kvm_vcpu_check_block() means KVM gets both
cases "wrong" (waited=true even if schedule() was never called, vCPU left in
a non-running state even though it's runnable).
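For reference, the wait loop in kvm_vcpu_block() is roughly the following (a
trimmed sketch with the rcuwait/preemption bookkeeping elided, not the exact
upstream code); "waited" is the value that gets returned, and it only flips
to true once the loop actually reaches schedule():

	bool waited = false;

	for (;;) {
		set_current_state(TASK_INTERRUPTIBLE);

		/* A negative return means "stop blocking". */
		if (kvm_vcpu_check_block(vcpu) < 0)
			break;

		waited = true;
		schedule();
	}

	return waited;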
My only hesitation with calling kvm_arch_vcpu_runnable() again is that it could
be problematic if KVM somehow managed to consume the event that caused
kvm_vcpu_has_events() to return true, but I don't see how that could happen
without it being a KVM bug.

Ah, and if we do go with the explicit check, it should come with a comment to
call out that KVM needs to return up the stack, e.g.

		/*
		 * If KVM stopped blocking the vCPU but the vCPU is still not
		 * runnable, then there is a pending host event of some kind,
		 * e.g. a pending signal.  Return up the stack to service the
		 * pending event without changing the vCPU's activity state.
		 */
		if (!kvm_arch_vcpu_runnable(vcpu))
			return 1;

so that we don't end up combining the checks into:

	while (!kvm_arch_vcpu_runnable(vcpu)) {
		/*
		 * Switch to the software timer before halt-polling/blocking as
		 * the guest's timer may be a break event for the vCPU, and the
		 * hypervisor timer runs only when the CPU is in guest mode.
		 * Switch before halt-polling so that KVM recognizes an expired
		 * timer before blocking.
		 */
		hv_timer = kvm_lapic_hv_timer_in_use(vcpu);
		if (hv_timer)
			kvm_lapic_switch_to_sw_timer(vcpu);

		kvm_vcpu_srcu_read_unlock(vcpu);
		if (vcpu->arch.mp_state == KVM_MP_STATE_HALTED)
			kvm_vcpu_halt(vcpu);
		else
			kvm_vcpu_block(vcpu);
		kvm_vcpu_srcu_read_lock(vcpu);

		if (hv_timer)
			kvm_lapic_switch_to_hv_timer(vcpu);
	}

which looks sane, but would mean KVM never bounces out to handle whatever event
needs handling.

Side topic, usage of kvm_apic_accept_events() appears to be broken (though
nothing can trigger the bug).  If kvm_apic_accept_events() were to return an
-errno, then kvm_arch_vcpu_ioctl_run() would return '0' to userspace without
updating vcpu->run->exit_reason.

I think an easy fix is to drop the return value entirely and then WARN if
kvm_check_nested_events() returns something other than -EBUSY.

	if (is_guest_mode(vcpu)) {
		r = kvm_check_nested_events(vcpu);
		if (r < 0) {
			WARN_ON_ONCE(r != -EBUSY);
			return;
		}
	}
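For reference, the call site in kvm_arch_vcpu_ioctl_run() that makes the stale
exit_reason possible looks something like the snippet below (paraphrased, not
verbatim from the tree); the "r = 0" path is where a stale
vcpu->run->exit_reason would be handed back to userspace:

	if (kvm_apic_accept_events(vcpu) < 0) {
		/* 0 goes back to userspace without exit_reason being updated. */
		r = 0;
		goto out;
	}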