From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 16 Nov 2022 09:19:28 -0800
From: David Matlack
To: "wangyanan (Y)"
Cc: Sean Christopherson, Paolo Bonzini, Wanpeng Li, kvm, David Hildenbrand,
 "Kernel Mailing List, Linux", Paul Mackerras, Claudio Imbrenda, KVM ARM,
 Janosch Frank, Marc Zyngier, Joerg Roedel, Huacai Chen,
 Christian Borntraeger, Aleksandar Markovic, Jon Cargille, kvm-ppc,
 linux-arm-kernel, Jim Mattson, Cornelia Huck, "open list:MIPS",
 Vitaly Kuznetsov
Subject: Re: disabling halt polling broken?
 (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat)
References: <20210925005528.1145584-1-seanjc@google.com>
 <03f2f5ab-e809-2ba5-bd98-3393c3b843d2@de.ibm.com>
 <43e42f5c-9d9f-9e8b-3a61-9a053a818250@de.ibm.com>
 <32810c89-44c6-6780-9d05-e49f6b897b6e@huawei.com>
In-Reply-To: <32810c89-44c6-6780-9d05-e49f6b897b6e@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 15, 2022 at 11:28:56AM +0800, wangyanan (Y) wrote:
> Hi Sean, Paolo,
>
> I recently also notice the behavior change of param halt_poll_ns.
> Now it loses the ability to:
> 1) dynamically disable halt polling for all the running VMs
> by `echo 0 > /sys`
> 2) dynamically adjust the halt polling interval for all the
> running VMs by `echo * > /sys`
>
> While in our cases, we usually use above two abilities, and
> KVM_CAP_HALT_POLL is not used yet.

I think the right path forward is to make KVM_CAP_HALT_POLL a pure
override of halt_poll_ns, and restore the pre-existing behavior of
halt_poll_ns whenever KVM_CAP_HALT_POLL is not used. e.g. see the patch
below.

That will fix issues (1) and (2) above for any VM not using
KVM_CAP_HALT_POLL. If a VM is using KVM_CAP_HALT_POLL, it will ignore
all changes to halt_poll_ns.

If we truly need a mechanism for admins to disable halt-polling on VMs
using KVM_CAP_HALT_POLL, we can introduce a separate module parameter
for that. But IMO, any setup that is sophisticated enough to use
KVM_CAP_HALT_POLL should also be able to use KVM_CAP_HALT_POLL to
disable halt polling.

If everyone is happy with this approach I can test and send a real
patch to the mailing list.
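To make the "pure override" semantics concrete, here is a minimal
userspace sketch of the selection rule described above. It is only a
model under stated assumptions, not kernel code: struct vm_model and
the three functions are hypothetical names invented for illustration,
and the nanosecond values are arbitrary. Only the two field names and
the rule itself (use the per-VM value once KVM_CAP_HALT_POLL has been
used, otherwise follow the module parameter) come from this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two per-VM fields discussed here. */
struct vm_model {
	bool override_halt_poll_ns;	/* set once KVM_CAP_HALT_POLL was used */
	unsigned int max_halt_poll_ns;	/* per-VM limit supplied with the cap */
};

/* Model of enabling the capability: remember the override and its value. */
static void vm_enable_halt_poll_cap(struct vm_model *vm, unsigned int max_ns)
{
	vm->override_halt_poll_ns = true;
	vm->max_halt_poll_ns = max_ns;
}

/*
 * Pick the limit that applies to this VM right now: the per-VM value if
 * the capability was used, else the current module parameter value.
 */
static unsigned int effective_max_halt_poll_ns(const struct vm_model *vm,
					       unsigned int module_param)
{
	return vm->override_halt_poll_ns ? vm->max_halt_poll_ns : module_param;
}

/* Walk through the cases from the discussion (values are arbitrary). */
static void halt_poll_model_demo(void)
{
	struct vm_model plain = { false, 0 };	/* never used the cap */
	struct vm_model tuned = { false, 0 };	/* will use the cap */
	unsigned int halt_poll_ns = 200000;	/* module parameter */

	vm_enable_halt_poll_cap(&tuned, 50000);

	/* A VM without the cap follows the module parameter. */
	assert(effective_max_halt_poll_ns(&plain, halt_poll_ns) == 200000);
	/* A VM with the cap keeps its own limit. */
	assert(effective_max_halt_poll_ns(&tuned, halt_poll_ns) == 50000);

	/* Admin writes 0: only VMs without the cap stop polling. */
	halt_poll_ns = 0;
	assert(effective_max_halt_poll_ns(&plain, halt_poll_ns) == 0);
	assert(effective_max_halt_poll_ns(&tuned, halt_poll_ns) == 50000);

	/* A VM using the cap disables polling by setting its limit to 0. */
	vm_enable_halt_poll_cap(&tuned, 0);
	assert(effective_max_halt_poll_ns(&tuned, 200000) == 0);
}
```

Calling halt_poll_model_demo() from a main() exercises all cases. The
last one is the trade-off noted above: a write to halt_poll_ns no
longer reaches a VM that used the capability, so such a VM has to be
reconfigured through the capability itself.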

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e6e66c5e56f2..253ad055b6ad 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -788,6 +788,7 @@ struct kvm {
 	struct srcu_struct srcu;
 	struct srcu_struct irq_srcu;
 	pid_t userspace_pid;
+	bool override_halt_poll_ns;
 	unsigned int max_halt_poll_ns;
 	u32 dirty_ring_size;
 	bool vm_bugged;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 43bbe4fde078..479d0d0da0b5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1198,8 +1198,6 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
 		goto out_err_no_arch_destroy_vm;
 	}
 
-	kvm->max_halt_poll_ns = halt_poll_ns;
-
 	r = kvm_arch_init_vm(kvm, type);
 	if (r)
 		goto out_err_no_arch_destroy_vm;
@@ -3371,7 +3369,7 @@ void kvm_sigset_deactivate(struct kvm_vcpu *vcpu)
 	sigemptyset(&current->real_blocked);
 }
 
-static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
+static void grow_halt_poll_ns(struct kvm_vcpu *vcpu, unsigned int max)
 {
 	unsigned int old, val, grow, grow_start;
 
@@ -3385,8 +3383,8 @@ static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
 	if (val < grow_start)
 		val = grow_start;
 
-	if (val > vcpu->kvm->max_halt_poll_ns)
-		val = vcpu->kvm->max_halt_poll_ns;
+	if (val > max)
+		val = max;
 
 	vcpu->halt_poll_ns = val;
 out:
@@ -3501,10 +3499,17 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 {
 	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
 	bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
+	unsigned int max_halt_poll_ns;
 	ktime_t start, cur, poll_end;
+	struct kvm *kvm = vcpu->kvm;
 	bool waited = false;
 	u64 halt_ns;
 
+	if (kvm->override_halt_poll_ns)
+		max_halt_poll_ns = kvm->max_halt_poll_ns;
+	else
+		max_halt_poll_ns = READ_ONCE(halt_poll_ns);
+
 	start = cur = poll_end = ktime_get();
 	if (do_halt_poll) {
 		ktime_t stop = ktime_add_ns(start, vcpu->halt_poll_ns);
@@ -3545,17 +3550,16 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 	if (halt_poll_allowed) {
 		if (!vcpu_valid_wakeup(vcpu)) {
 			shrink_halt_poll_ns(vcpu);
-		} else if (vcpu->kvm->max_halt_poll_ns) {
+		} else if (max_halt_poll_ns) {
 			if (halt_ns <= vcpu->halt_poll_ns)
 				;
 			/* we had a long block, shrink polling */
-			else if (vcpu->halt_poll_ns &&
-				 halt_ns > vcpu->kvm->max_halt_poll_ns)
+			else if (vcpu->halt_poll_ns && halt_ns > max_halt_poll_ns)
 				shrink_halt_poll_ns(vcpu);
 			/* we had a short halt and our poll time is too small */
-			else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
-				 halt_ns < vcpu->kvm->max_halt_poll_ns)
-				grow_halt_poll_ns(vcpu);
+			else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
+				 halt_ns < max_halt_poll_ns)
+				grow_halt_poll_ns(vcpu, max_halt_poll_ns);
 		} else {
 			vcpu->halt_poll_ns = 0;
 		}
@@ -4588,6 +4592,7 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
 		if (cap->flags || cap->args[0] != (unsigned int)cap->args[0])
 			return -EINVAL;
 
+		kvm->override_halt_poll_ns = true;
 		kvm->max_halt_poll_ns = cap->args[0];
 		return 0;
 	}

>
> On 2021/9/28 1:33, Sean Christopherson wrote:
> > On Mon, Sep 27, 2021, Paolo Bonzini wrote:
> > > On Mon, Sep 27, 2021 at 5:17 PM Christian Borntraeger
> > > wrote:
> > > > > So I think there are two possibilities that makes sense:
> > > > >
> > > > > * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that
> > > > what about using halt_poll_ns for those VMs that did not uses KVM_CAP_HALT_POLL and the private number for those that did.
> > > Yes, that's what I meant. David pointed out that doesn't allow you to
> > > disable halt polling altogether, but for that you can always ask each
> > > VM's userspace one by one, or just not use KVM_CAP_HALT_POLL. (Also, I
> > > don't know about Google's usecase, but mine was actually more about
> > > using KVM_CAP_HALT_POLL to *disable* halt polling on some VMs!).
> > I kinda like the idea if special-casing halt_poll_ns=0, e.g. for testing or
> > in-the-field mitigation if halt-polling is broken. It'd be trivial to support, e.g.
> Do we have any plan to repost the diff as a fix?
> I would be very nice that this issue can be solved.
>
> Besides, I think we may need some Doc for users to describe
> how halt_poll_ns works with KVM_CAP_HALT_POLL, like
> "Documentation/virt/guest-halt-polling.rst".
> > @@ -3304,19 +3304,23 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
> >  	update_halt_poll_stats(vcpu, start, poll_end, !waited);
> >
> >  	if (halt_poll_allowed) {
> > +		max_halt_poll_ns = vcpu->kvm->max_halt_poll_ns;
> > +		if (!max_halt_poll_ns || !halt_poll_ns)  <------ squish the max if halt_poll_ns==0
> > +			max_halt_poll_ns = halt_poll_ns;
> > +
> Does this mean that KVM_CAP_HALT_POLL will not be able to
> disable halt polling for a VM individually when halt_poll_ns !=0?
> >  		if (!vcpu_valid_wakeup(vcpu)) {
> >  			shrink_halt_poll_ns(vcpu);
> > -		} else if (vcpu->kvm->max_halt_poll_ns) {
> > +		} else if (max_halt_poll_ns) {
> >  			if (halt_ns <= vcpu->halt_poll_ns)
> >  				;
> >  			/* we had a long block, shrink polling */
> >  			else if (vcpu->halt_poll_ns &&
> > -				 halt_ns > vcpu->kvm->max_halt_poll_ns)
> > +				 halt_ns > max_halt_poll_ns)
> >  				shrink_halt_poll_ns(vcpu);
> >  			/* we had a short halt and our poll time is too small */
> > -			else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
> > -				 halt_ns < vcpu->kvm->max_halt_poll_ns)
> > -				grow_halt_poll_ns(vcpu);
> > +			else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
> > +				 halt_ns < max_halt_poll_ns)
> > +				grow_halt_poll_ns(vcpu, max_halt_poll_ns);
> >  		} else {
> >  			vcpu->halt_poll_ns = 0;
> >  		}
> > _______________________________________________
> > kvmarm mailing list
> > kvmarm@lists.cs.columbia.edu
> > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> > .
> Thanks, > Yanan From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mm01.cs.columbia.edu (mm01.cs.columbia.edu [128.59.11.253]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3FA5CC43217 for ; Wed, 16 Nov 2022 17:19:39 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id BC1AE4B852; Wed, 16 Nov 2022 12:19:38 -0500 (EST) X-Virus-Scanned: at lists.cs.columbia.edu Authentication-Results: mm01.cs.columbia.edu (amavisd-new); dkim=softfail (fail, message has been altered) header.i=@google.com Received: from mm01.cs.columbia.edu ([127.0.0.1]) by localhost (mm01.cs.columbia.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id aeM8L6G4Pr5b; Wed, 16 Nov 2022 12:19:37 -0500 (EST) Received: from mm01.cs.columbia.edu (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id 5783D4B85B; Wed, 16 Nov 2022 12:19:37 -0500 (EST) Received: from localhost (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id C3C654B631 for ; Wed, 16 Nov 2022 12:19:35 -0500 (EST) X-Virus-Scanned: at lists.cs.columbia.edu Received: from mm01.cs.columbia.edu ([127.0.0.1]) by localhost (mm01.cs.columbia.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id TIHG+fHln7ck for ; Wed, 16 Nov 2022 12:19:34 -0500 (EST) Received: from mail-pf1-f178.google.com (mail-pf1-f178.google.com [209.85.210.178]) by mm01.cs.columbia.edu (Postfix) with ESMTPS id 44B814B2FF for ; Wed, 16 Nov 2022 12:19:34 -0500 (EST) Received: by mail-pf1-f178.google.com with SMTP id k15so18110008pfg.2 for ; Wed, 16 Nov 2022 09:19:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=yd++ObN+WrxyzCV6eu+IuUH2K33aygMDqpLl7YqP9bQ=; 
b=K5O1OhO9WbRzwyTmhYztgpVTLFrj0jLsQw/X0Y2COhlWg/KCpnb3BuQ/3DWYU/QUyy yn6MmJVPG4DpxPGCodjTtHXveMagw2rprK5uLT+OwUN4vOCoVyvMx/lewpri+A2NUKw7 IATvHrc/GE8CCvbn/3FH8sh8WbpgcGvRCKKhDX9YmnXDsYO8cpZ4JyKmwtEMBqutYZ14 Ff2cuZRp8Ae8g/u5S2WKIjd493CjipYO/FbPKGh3i4ORxyz3HFK4UsZLQAkDnl2UWzVP 4w3mooHuX4QS++G0LYjLJnrSgQXT6koq357Hksp4BgysnMT4kd4YH7s6t6H6qB7ovrYS 2LaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=yd++ObN+WrxyzCV6eu+IuUH2K33aygMDqpLl7YqP9bQ=; b=aGIJXxm3boR7y9wje7Wv0ZxtFqEUUpqMUv3Msl+Cf3M7fAfr2IGOWchnNH7BOLzy06 Sf8/lhSn9gVdeOqN6S5y+UBbnHN+2+fU7nM8uMV2WTja5mV5BqrRmm07Vzx2jGDgGQss Np4Xmy2a61hsal0E1fnS0SeUI9ciGBK0CeH+RPQmd/i8/X9xBav0RVkQyG1PAAyZYT7t B2OOR8m/bMLfj7I3LNb4+ZxdPnzJxgx2mCse4of6juvNJy7P0c36AAQdNwhlHKckVA38 JXK19PBX8PTDJ8AukorWKs0HFyUe0KydUbG0965jleUNJeYJv/OxqlM6C0TnGGPTwwch 5dRw== X-Gm-Message-State: ANoB5pnypesv9NDyXRbMpn8UV2qVbs6bhtcfCY2eGTw1m57zha8P2RJ+ Yilz+FSKtS8RSx6te28LzAyOoA== X-Google-Smtp-Source: AA0mqf7vO0cqTPp+H1qbd2Dp5/UFD3iMJrWGOOmHhwF4ArV24KELQAGgbhwMNKl1+aZ5S+A+ooIaeA== X-Received: by 2002:a63:221a:0:b0:464:3985:3c63 with SMTP id i26-20020a63221a000000b0046439853c63mr21258516pgi.141.1668619173006; Wed, 16 Nov 2022 09:19:33 -0800 (PST) Received: from google.com (223.103.125.34.bc.googleusercontent.com. [34.125.103.223]) by smtp.gmail.com with ESMTPSA id k26-20020aa7999a000000b00561382a5a25sm11102299pfh.26.2022.11.16.09.19.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 16 Nov 2022 09:19:32 -0800 (PST) Date: Wed, 16 Nov 2022 09:19:28 -0800 From: David Matlack To: "wangyanan (Y)" Subject: Re: disabling halt polling broken? 
(was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) Message-ID: References: <20210925005528.1145584-1-seanjc@google.com> <03f2f5ab-e809-2ba5-bd98-3393c3b843d2@de.ibm.com> <43e42f5c-9d9f-9e8b-3a61-9a053a818250@de.ibm.com> <32810c89-44c6-6780-9d05-e49f6b897b6e@huawei.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <32810c89-44c6-6780-9d05-e49f6b897b6e@huawei.com> Cc: Wanpeng Li , kvm , David Hildenbrand , "open list:MIPS" , Paul Mackerras , Claudio Imbrenda , KVM ARM , Janosch Frank , Marc Zyngier , Joerg Roedel , Huacai Chen , Christian Borntraeger , Aleksandar Markovic , Jon Cargille , kvm-ppc , linux-arm-kernel , Jim Mattson , Cornelia Huck , "Kernel Mailing List, Linux" , Paolo Bonzini , Vitaly Kuznetsov X-BeenThere: kvmarm@lists.cs.columbia.edu X-Mailman-Version: 2.1.14 Precedence: list List-Id: Where KVM/ARM decisions are made List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: kvmarm-bounces@lists.cs.columbia.edu Sender: kvmarm-bounces@lists.cs.columbia.edu On Tue, Nov 15, 2022 at 11:28:56AM +0800, wangyanan (Y) wrote: > Hi Sean, Paolo, > > I recently also notice the behavior change of param halt_poll_ns. > Now it loses the ability to: > 1) dynamically disable halt polling for all the running VMs > by `echo 0 > /sys` > 2) dynamically adjust the halt polling interval for all the > running VMs by `echo * > /sys` > > While in our cases, we usually use above two abilities, and > KVM_CAP_HALT_POLL is not used yet. I think the right path forward is to make KVM_CAP_HALT_POLL a pure override of halt_poll_ns, and restore the pre-existing behavior of halt_poll_ns whenever KVM_CAP_HALT_POLL is not used. e.g. see the patch below. That will fix issues (1) and (2) above for any VM not using KVM_CAP_HALT_POLL. If a VM is using KVM_CAP_HALT_POLL, it will ignore all changes to halt_poll_ns. 
If we truly need a mechanism for admins to disable halt-polling on VMs using KVM_CAP_HALT_POLL, we can introduce a separate module parameter for that. But IMO, any setup that is sophisticated enough to use KVM_CAP_HALT_POLL should also be able to use KVM_CAP_HALT_POLL to disable halt polling. If everyone is happy with this approach I can test and send a real patch to the mailing list. diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index e6e66c5e56f2..253ad055b6ad 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -788,6 +788,7 @@ struct kvm { struct srcu_struct srcu; struct srcu_struct irq_srcu; pid_t userspace_pid; + bool override_halt_poll_ns; unsigned int max_halt_poll_ns; u32 dirty_ring_size; bool vm_bugged; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 43bbe4fde078..479d0d0da0b5 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1198,8 +1198,6 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname) goto out_err_no_arch_destroy_vm; } - kvm->max_halt_poll_ns = halt_poll_ns; - r = kvm_arch_init_vm(kvm, type); if (r) goto out_err_no_arch_destroy_vm; @@ -3371,7 +3369,7 @@ void kvm_sigset_deactivate(struct kvm_vcpu *vcpu) sigemptyset(¤t->real_blocked); } -static void grow_halt_poll_ns(struct kvm_vcpu *vcpu) +static void grow_halt_poll_ns(struct kvm_vcpu *vcpu, unsigned int max) { unsigned int old, val, grow, grow_start; @@ -3385,8 +3383,8 @@ static void grow_halt_poll_ns(struct kvm_vcpu *vcpu) if (val < grow_start) val = grow_start; - if (val > vcpu->kvm->max_halt_poll_ns) - val = vcpu->kvm->max_halt_poll_ns; + if (val > max) + val = max; vcpu->halt_poll_ns = val; out: @@ -3501,10 +3499,17 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu) { bool halt_poll_allowed = !kvm_arch_no_poll(vcpu); bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns; + unsigned int max_halt_poll_ns; ktime_t start, cur, poll_end; + struct kvm *kvm = vcpu->kvm; bool waited = false; u64 halt_ns; + if 
(kvm->override_halt_poll_ns) + max_halt_poll_ns = kvm->max_halt_poll_ns; + else + max_halt_poll_ns = READ_ONCE(halt_poll_ns); + start = cur = poll_end = ktime_get(); if (do_halt_poll) { ktime_t stop = ktime_add_ns(start, vcpu->halt_poll_ns); @@ -3545,17 +3550,16 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu) if (halt_poll_allowed) { if (!vcpu_valid_wakeup(vcpu)) { shrink_halt_poll_ns(vcpu); - } else if (vcpu->kvm->max_halt_poll_ns) { + } else if (max_halt_poll_ns) { if (halt_ns <= vcpu->halt_poll_ns) ; /* we had a long block, shrink polling */ - else if (vcpu->halt_poll_ns && - halt_ns > vcpu->kvm->max_halt_poll_ns) + else if (vcpu->halt_poll_ns && halt_ns > max_halt_poll_ns) shrink_halt_poll_ns(vcpu); /* we had a short halt and our poll time is too small */ - else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns && - halt_ns < vcpu->kvm->max_halt_poll_ns) - grow_halt_poll_ns(vcpu); + else if (vcpu->halt_poll_ns < max_halt_poll_ns && + halt_ns < max_halt_poll_ns) + grow_halt_poll_ns(vcpu, max_halt_poll_ns); } else { vcpu->halt_poll_ns = 0; } @@ -4588,6 +4592,7 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm, if (cap->flags || cap->args[0] != (unsigned int)cap->args[0]) return -EINVAL; + kvm->override_halt_poll_ns = true; kvm->max_halt_poll_ns = cap->args[0]; return 0; } > > On 2021/9/28 1:33, Sean Christopherson wrote: > > On Mon, Sep 27, 2021, Paolo Bonzini wrote: > > > On Mon, Sep 27, 2021 at 5:17 PM Christian Borntraeger > > > wrote: > > > > > So I think there are two possibilities that makes sense: > > > > > > > > > > * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that > > > > what about using halt_poll_ns for those VMs that did not uses KVM_CAP_HALT_POLL and the private number for those that did. > > > Yes, that's what I meant. David pointed out that doesn't allow you to > > > disable halt polling altogether, but for that you can always ask each > > > VM's userspace one by one, or just not use KVM_CAP_HALT_POLL. 
(Also, I > > > don't know about Google's usecase, but mine was actually more about > > > using KVM_CAP_HALT_POLL to *disable* halt polling on some VMs!). > > I kinda like the idea if special-casing halt_poll_ns=0, e.g. for testing or > > in-the-field mitigation if halt-polling is broken. It'd be trivial to support, e.g. > Do we have any plan to repost the diff as a fix? > I would be very nice that this issue can be solved. > > Besides, I think we may need some Doc for users to describe > how halt_poll_ns works with KVM_CAP_HALT_POLL, like > "Documentation/virt/guest-halt-polling.rst". > > @@ -3304,19 +3304,23 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu) > > update_halt_poll_stats(vcpu, start, poll_end, !waited); > > > > if (halt_poll_allowed) { > > + max_halt_poll_ns = vcpu->kvm->max_halt_poll_ns; > > + if (!max_halt_poll_ns || !halt_poll_ns) <------ squish the max if halt_poll_ns==0 > > + max_halt_poll_ns = halt_poll_ns; > > + > Does this mean that KVM_CAP_HALT_POLL will not be able to > disable halt polling for a VM individually when halt_poll_ns !=0? 
> > if (!vcpu_valid_wakeup(vcpu)) { > > shrink_halt_poll_ns(vcpu); > > - } else if (vcpu->kvm->max_halt_poll_ns) { > > + } else if (max_halt_poll_ns) { > > if (halt_ns <= vcpu->halt_poll_ns) > > ; > > /* we had a long block, shrink polling */ > > else if (vcpu->halt_poll_ns && > > - halt_ns > vcpu->kvm->max_halt_poll_ns) > > + halt_ns > max_halt_poll_ns) > > shrink_halt_poll_ns(vcpu); > > /* we had a short halt and our poll time is too small */ > > - else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns && > > - halt_ns < vcpu->kvm->max_halt_poll_ns) > > - grow_halt_poll_ns(vcpu); > > + else if (vcpu->halt_poll_ns < max_halt_poll_ns && > > + halt_ns < max_halt_poll_ns) > > + grow_halt_poll_ns(vcpu, max_halt_poll_ns); > > } else { > > vcpu->halt_poll_ns = 0; > > } > > _______________________________________________ > > kvmarm mailing list > > kvmarm@lists.cs.columbia.edu > > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm > > . > Thanks, > Yanan _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8786FC4332F for ; Wed, 16 Nov 2022 17:20:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:Cc:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; 
bh=omnPGczk59hQQpxrS7uce4kwsF6TlYJWgpxRQfTGDZU=; b=4+16/kmUkYHuf9 H3vLq4ZSYgA8ja/vGuyT5Ro28ceM8NyYDXMi5LrJXKN5iYRU5COG+6KTac4b5kRMgmf7CwTRAww9e ITpjMpclzyiXdGW2xXa9iD2QwhYnyKlfVnvV4W36atFAgTc3dIAd1lUFx+Fu9ft//cUCbD/nu3s/q +9Ork9a9kb9XRmW/6LcyauCNKBgrZuJKCdnVqLFWDzIeCX/uKs3g5yIqoj7La3fQ1iCVoTChVPfge 5qRMMcZRXDgiX+4TnI3tEdVrVo/xYvRGE7HiU78H4Y6UbmLycLCxuEfyiLmjko0KJOYQRkcX/4GO4 YT+NhZDESAopnnXQDEwQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1ovM4c-006N8G-NF; Wed, 16 Nov 2022 17:19:42 +0000 Received: from mail-pf1-x434.google.com ([2607:f8b0:4864:20::434]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1ovM4X-006N5G-Q2 for linux-arm-kernel@lists.infradead.org; Wed, 16 Nov 2022 17:19:41 +0000 Received: by mail-pf1-x434.google.com with SMTP id y203so18107415pfb.4 for ; Wed, 16 Nov 2022 09:19:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=yd++ObN+WrxyzCV6eu+IuUH2K33aygMDqpLl7YqP9bQ=; b=K5O1OhO9WbRzwyTmhYztgpVTLFrj0jLsQw/X0Y2COhlWg/KCpnb3BuQ/3DWYU/QUyy yn6MmJVPG4DpxPGCodjTtHXveMagw2rprK5uLT+OwUN4vOCoVyvMx/lewpri+A2NUKw7 IATvHrc/GE8CCvbn/3FH8sh8WbpgcGvRCKKhDX9YmnXDsYO8cpZ4JyKmwtEMBqutYZ14 Ff2cuZRp8Ae8g/u5S2WKIjd493CjipYO/FbPKGh3i4ORxyz3HFK4UsZLQAkDnl2UWzVP 4w3mooHuX4QS++G0LYjLJnrSgQXT6koq357Hksp4BgysnMT4kd4YH7s6t6H6qB7ovrYS 2LaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=yd++ObN+WrxyzCV6eu+IuUH2K33aygMDqpLl7YqP9bQ=; b=QgdxTvmOgVM+rchFC7fuWVhSgTo9nTqXy6POafdytq1HlxiGvm/9aDQK67V97ZdrvP pJ7Nbw7RQDscUSow5IHy0RpuXaiJUJQ9S/gkwVZgE6dMT4tixRl+mW2zjUrgXj5RFjrs 
PoBN6QhqICEMuHhJfuNIy+Q87gWjl+vQfeDJUjtN/3eeienmpWR/BXLA/y8VCvrJ+rXq VfpdQ9feNRL7GcW2YoYo/VwKBhgmpP9BrXC0PxirGW03NZx77oZvACPV6tviTrJdafLL p2shga+KjIbmfMjAy6qWz9svZ+swM5cc8475hhj5gecf25SJv5zVPPT91ACPojgLH1+z RRxg== X-Gm-Message-State: ANoB5pkxN+2QFi5mtFss4y8bkc/EVrcjpn53Au+UvgHfeWOhvG37+l4G FAM3LGEUp5cVWyGfXnpk4FUMDA== X-Google-Smtp-Source: AA0mqf7vO0cqTPp+H1qbd2Dp5/UFD3iMJrWGOOmHhwF4ArV24KELQAGgbhwMNKl1+aZ5S+A+ooIaeA== X-Received: by 2002:a63:221a:0:b0:464:3985:3c63 with SMTP id i26-20020a63221a000000b0046439853c63mr21258516pgi.141.1668619173006; Wed, 16 Nov 2022 09:19:33 -0800 (PST) Received: from google.com (223.103.125.34.bc.googleusercontent.com. [34.125.103.223]) by smtp.gmail.com with ESMTPSA id k26-20020aa7999a000000b00561382a5a25sm11102299pfh.26.2022.11.16.09.19.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 16 Nov 2022 09:19:32 -0800 (PST) Date: Wed, 16 Nov 2022 09:19:28 -0800 From: David Matlack To: "wangyanan (Y)" Cc: Sean Christopherson , Paolo Bonzini , Wanpeng Li , kvm , David Hildenbrand , "Kernel Mailing List, Linux" , Paul Mackerras , Claudio Imbrenda , KVM ARM , Janosch Frank , Marc Zyngier , Joerg Roedel , Huacai Chen , Christian Borntraeger , Aleksandar Markovic , Jon Cargille , kvm-ppc , linux-arm-kernel , Jim Mattson , Cornelia Huck , "open list:MIPS" , Vitaly Kuznetsov Subject: Re: disabling halt polling broken? 
(was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) Message-ID: References: <20210925005528.1145584-1-seanjc@google.com> <03f2f5ab-e809-2ba5-bd98-3393c3b843d2@de.ibm.com> <43e42f5c-9d9f-9e8b-3a61-9a053a818250@de.ibm.com> <32810c89-44c6-6780-9d05-e49f6b897b6e@huawei.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <32810c89-44c6-6780-9d05-e49f6b897b6e@huawei.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221116_091937_871434_C1421130 X-CRM114-Status: GOOD ( 46.04 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Tue, Nov 15, 2022 at 11:28:56AM +0800, wangyanan (Y) wrote: > Hi Sean, Paolo, > > I recently also notice the behavior change of param halt_poll_ns. > Now it loses the ability to: > 1) dynamically disable halt polling for all the running VMs > by `echo 0 > /sys` > 2) dynamically adjust the halt polling interval for all the > running VMs by `echo * > /sys` > > While in our cases, we usually use above two abilities, and > KVM_CAP_HALT_POLL is not used yet. I think the right path forward is to make KVM_CAP_HALT_POLL a pure override of halt_poll_ns, and restore the pre-existing behavior of halt_poll_ns whenever KVM_CAP_HALT_POLL is not used. e.g. see the patch below. That will fix issues (1) and (2) above for any VM not using KVM_CAP_HALT_POLL. If a VM is using KVM_CAP_HALT_POLL, it will ignore all changes to halt_poll_ns. If we truly need a mechanism for admins to disable halt-polling on VMs using KVM_CAP_HALT_POLL, we can introduce a separate module parameter for that. 
But IMO, any setup that is sophisticated enough to use KVM_CAP_HALT_POLL
should also be able to use KVM_CAP_HALT_POLL to disable halt polling.

If everyone is happy with this approach I can test and send a real patch
to the mailing list.

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e6e66c5e56f2..253ad055b6ad 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -788,6 +788,7 @@ struct kvm {
 	struct srcu_struct srcu;
 	struct srcu_struct irq_srcu;
 	pid_t userspace_pid;
+	bool override_halt_poll_ns;
 	unsigned int max_halt_poll_ns;
 	u32 dirty_ring_size;
 	bool vm_bugged;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 43bbe4fde078..479d0d0da0b5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1198,8 +1198,6 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
 		goto out_err_no_arch_destroy_vm;
 	}
 
-	kvm->max_halt_poll_ns = halt_poll_ns;
-
 	r = kvm_arch_init_vm(kvm, type);
 	if (r)
 		goto out_err_no_arch_destroy_vm;
@@ -3371,7 +3369,7 @@ void kvm_sigset_deactivate(struct kvm_vcpu *vcpu)
 	sigemptyset(&current->real_blocked);
 }
 
-static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
+static void grow_halt_poll_ns(struct kvm_vcpu *vcpu, unsigned int max)
 {
 	unsigned int old, val, grow, grow_start;
 
@@ -3385,8 +3383,8 @@ static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
 	if (val < grow_start)
 		val = grow_start;
 
-	if (val > vcpu->kvm->max_halt_poll_ns)
-		val = vcpu->kvm->max_halt_poll_ns;
+	if (val > max)
+		val = max;
 
 	vcpu->halt_poll_ns = val;
 out:
@@ -3501,10 +3499,17 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 {
 	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
 	bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
+	unsigned int max_halt_poll_ns;
 	ktime_t start, cur, poll_end;
+	struct kvm *kvm = vcpu->kvm;
 	bool waited = false;
 	u64 halt_ns;
 
+	if (kvm->override_halt_poll_ns)
+		max_halt_poll_ns = kvm->max_halt_poll_ns;
+	else
+		max_halt_poll_ns = READ_ONCE(halt_poll_ns);
+
 	start = cur = poll_end = ktime_get();
 	if (do_halt_poll) {
 		ktime_t stop = ktime_add_ns(start, vcpu->halt_poll_ns);
@@ -3545,17 +3550,16 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 	if (halt_poll_allowed) {
 		if (!vcpu_valid_wakeup(vcpu)) {
 			shrink_halt_poll_ns(vcpu);
-		} else if (vcpu->kvm->max_halt_poll_ns) {
+		} else if (max_halt_poll_ns) {
 			if (halt_ns <= vcpu->halt_poll_ns)
 				;
 			/* we had a long block, shrink polling */
-			else if (vcpu->halt_poll_ns &&
-				 halt_ns > vcpu->kvm->max_halt_poll_ns)
+			else if (vcpu->halt_poll_ns && halt_ns > max_halt_poll_ns)
 				shrink_halt_poll_ns(vcpu);
 			/* we had a short halt and our poll time is too small */
-			else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
-				 halt_ns < vcpu->kvm->max_halt_poll_ns)
-				grow_halt_poll_ns(vcpu);
+			else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
+				 halt_ns < max_halt_poll_ns)
+				grow_halt_poll_ns(vcpu, max_halt_poll_ns);
 		} else {
 			vcpu->halt_poll_ns = 0;
 		}
@@ -4588,6 +4592,7 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
 		if (cap->flags || cap->args[0] != (unsigned int)cap->args[0])
 			return -EINVAL;
 
+		kvm->override_halt_poll_ns = true;
 		kvm->max_halt_poll_ns = cap->args[0];
 		return 0;
 	}

>
> On 2021/9/28 1:33, Sean Christopherson wrote:
> > On Mon, Sep 27, 2021, Paolo Bonzini wrote:
> > > On Mon, Sep 27, 2021 at 5:17 PM Christian Borntraeger
> > > wrote:
> > > > > So I think there are two possibilities that makes sense:
> > > > >
> > > > > * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that
> > > > what about using halt_poll_ns for those VMs that did not uses KVM_CAP_HALT_POLL and the private number for those that did.
> > > Yes, that's what I meant. David pointed out that doesn't allow you to
> > > disable halt polling altogether, but for that you can always ask each
> > > VM's userspace one by one, or just not use KVM_CAP_HALT_POLL. (Also, I
> > > don't know about Google's usecase, but mine was actually more about
> > > using KVM_CAP_HALT_POLL to *disable* halt polling on some VMs!).
> > I kinda like the idea if special-casing halt_poll_ns=0, e.g. for testing or
> > in-the-field mitigation if halt-polling is broken. It'd be trivial to support, e.g.
> Do we have any plan to repost the diff as a fix?
> I would be very nice that this issue can be solved.
>
> Besides, I think we may need some Doc for users to describe
> how halt_poll_ns works with KVM_CAP_HALT_POLL, like
> "Documentation/virt/guest-halt-polling.rst".
> > @@ -3304,19 +3304,23 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
> >  		update_halt_poll_stats(vcpu, start, poll_end, !waited);
> >
> >  	if (halt_poll_allowed) {
> > +		max_halt_poll_ns = vcpu->kvm->max_halt_poll_ns;
> > +		if (!max_halt_poll_ns || !halt_poll_ns)  <------ squish the max if halt_poll_ns==0
> > +			max_halt_poll_ns = halt_poll_ns;
> > +
> Does this mean that KVM_CAP_HALT_POLL will not be able to
> disable halt polling for a VM individually when halt_poll_ns !=0?
> >  		if (!vcpu_valid_wakeup(vcpu)) {
> >  			shrink_halt_poll_ns(vcpu);
> > -		} else if (vcpu->kvm->max_halt_poll_ns) {
> > +		} else if (max_halt_poll_ns) {
> >  			if (halt_ns <= vcpu->halt_poll_ns)
> >  				;
> >  			/* we had a long block, shrink polling */
> >  			else if (vcpu->halt_poll_ns &&
> > -				 halt_ns > vcpu->kvm->max_halt_poll_ns)
> > +				 halt_ns > max_halt_poll_ns)
> >  				shrink_halt_poll_ns(vcpu);
> >  			/* we had a short halt and our poll time is too small */
> > -			else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
> > -				 halt_ns < vcpu->kvm->max_halt_poll_ns)
> > -				grow_halt_poll_ns(vcpu);
> > +			else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
> > +				 halt_ns < max_halt_poll_ns)
> > +				grow_halt_poll_ns(vcpu, max_halt_poll_ns);
> >  		} else {
> >  			vcpu->halt_poll_ns = 0;
> >  		}
> > _______________________________________________
> > kvmarm mailing list
> > kvmarm@lists.cs.columbia.edu
> > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> > .
> Thanks,
> Yanan

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
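
To make the difference between the two proposals in this thread concrete,
here is a small userspace model (not kernel code). It only mirrors how each
approach picks the effective maximum poll time. struct vm_model and both
function names are made up for illustration; the real logic lives in
kvm_vcpu_halt(), and the model assumes nothing beyond what the diffs show.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-VM fields touched by the patch. */
struct vm_model {
	bool override_halt_poll_ns;     /* set once KVM_CAP_HALT_POLL is used */
	unsigned int max_halt_poll_ns;  /* value passed via KVM_CAP_HALT_POLL */
};

/*
 * Override approach: if the VM used KVM_CAP_HALT_POLL, its private value
 * wins unconditionally; otherwise the module parameter is read each time.
 */
static unsigned int max_with_override(const struct vm_model *vm,
				      unsigned int module_param)
{
	return vm->override_halt_poll_ns ? vm->max_halt_poll_ns : module_param;
}

/*
 * Earlier quoted approach: fall back to the module parameter whenever
 * either the per-VM max or the module parameter is zero.
 */
static unsigned int max_with_squish(unsigned int vm_max,
				    unsigned int module_param)
{
	if (!vm_max || !module_param)
		return module_param;
	return vm_max;
}
```

In this model, a VM that enabled KVM_CAP_HALT_POLL with 0 stays at 0 under
the override approach even when halt_poll_ns is nonzero. Under the squish
approach the same VM would fall back to halt_poll_ns, which is the concern
raised in the quoted question. Conversely, writing 0 to halt_poll_ns disables
polling for every VM under the squish approach, but is ignored by VMs using
the capability under the override approach.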