From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 0B0D5C43217
	for <linux-kernel@archiver.kernel.org>; Wed, 16 Nov 2022 17:19:51 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S234752AbiKPRTt (ORCPT <rfc822;linux-kernel@archiver.kernel.org>);
        Wed, 16 Nov 2022 12:19:49 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36856 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S233809AbiKPRTi (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Wed, 16 Nov 2022 12:19:38 -0500
Received: from mail-pf1-x432.google.com (mail-pf1-x432.google.com [IPv6:2607:f8b0:4864:20::432])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B3DA91E8
        for <linux-kernel@vger.kernel.org>; Wed, 16 Nov 2022 09:19:33 -0800 (PST)
Received: by mail-pf1-x432.google.com with SMTP id 140so16631387pfz.6
        for <linux-kernel@vger.kernel.org>; Wed, 16 Nov 2022 09:19:33 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20210112;
        h=in-reply-to:content-disposition:mime-version:references:message-id
         :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to;
        bh=yd++ObN+WrxyzCV6eu+IuUH2K33aygMDqpLl7YqP9bQ=;
        b=K5O1OhO9WbRzwyTmhYztgpVTLFrj0jLsQw/X0Y2COhlWg/KCpnb3BuQ/3DWYU/QUyy
         yn6MmJVPG4DpxPGCodjTtHXveMagw2rprK5uLT+OwUN4vOCoVyvMx/lewpri+A2NUKw7
         IATvHrc/GE8CCvbn/3FH8sh8WbpgcGvRCKKhDX9YmnXDsYO8cpZ4JyKmwtEMBqutYZ14
         Ff2cuZRp8Ae8g/u5S2WKIjd493CjipYO/FbPKGh3i4ORxyz3HFK4UsZLQAkDnl2UWzVP
         4w3mooHuX4QS++G0LYjLJnrSgQXT6koq357Hksp4BgysnMT4kd4YH7s6t6H6qB7ovrYS
         2LaQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20210112;
        h=in-reply-to:content-disposition:mime-version:references:message-id
         :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date
         :message-id:reply-to;
        bh=yd++ObN+WrxyzCV6eu+IuUH2K33aygMDqpLl7YqP9bQ=;
        b=GR0Te3xljwpJBNyoonF70vsOaIM3LPNKLDBGHN3rj0xASyQUDF8ZR9kmrFwftG6GEw
         45sFVP8vSi2H56zD2jp7TDmVMjXLh11WawrsravuMN33AVMDI8tkk4Zf6seRsLf0eRQo
         63wcJpelw9i0Ajet7SIXzxMIinpnPXt4blPckuEBGwnzQiPof46ZgwAvic1kO8zDN9o9
         kuqbv41+QYH4FiMZ4QAja6AwhJ38vPvTPPf0HWOEhOxjmsoD/ybumfyo7Embt8hfM7yO
         frg17xikhMgM2FaNmjls6f2NhQw+Wnj5g3cttzIS+mg9LpyuZbqnTX5EGzmCmrES9lE6
         +PBQ==
X-Gm-Message-State: ANoB5plC/gudTEmwQIOvBS4lZ8uRzueR8pz5ntfJ/cYrYlb45YdzRwdm
        UzjECzmgaiCxgQ6h3FEHk0w/2Q==
X-Google-Smtp-Source: AA0mqf7vO0cqTPp+H1qbd2Dp5/UFD3iMJrWGOOmHhwF4ArV24KELQAGgbhwMNKl1+aZ5S+A+ooIaeA==
X-Received: by 2002:a63:221a:0:b0:464:3985:3c63 with SMTP id i26-20020a63221a000000b0046439853c63mr21258516pgi.141.1668619173006;
        Wed, 16 Nov 2022 09:19:33 -0800 (PST)
Received: from google.com (223.103.125.34.bc.googleusercontent.com. [34.125.103.223])
        by smtp.gmail.com with ESMTPSA id k26-20020aa7999a000000b00561382a5a25sm11102299pfh.26.2022.11.16.09.19.31
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Wed, 16 Nov 2022 09:19:32 -0800 (PST)
Date:   Wed, 16 Nov 2022 09:19:28 -0800
From:   David Matlack <dmatlack@google.com>
To:     "wangyanan (Y)" <wangyanan55@huawei.com>
Cc:     Sean Christopherson <seanjc@google.com>,
        Paolo Bonzini <pbonzini@redhat.com>,
        Wanpeng Li <wanpengli@tencent.com>, kvm <kvm@vger.kernel.org>,
        David Hildenbrand <david@redhat.com>,
        "Kernel Mailing List, Linux" <linux-kernel@vger.kernel.org>,
        Paul Mackerras <paulus@ozlabs.org>,
        Claudio Imbrenda <imbrenda@linux.ibm.com>,
        KVM ARM <kvmarm@lists.cs.columbia.edu>,
        Janosch Frank <frankja@linux.ibm.com>,
        Marc Zyngier <maz@kernel.org>, Joerg Roedel <joro@8bytes.org>,
        Huacai Chen <chenhuacai@kernel.org>,
        Christian Borntraeger <borntraeger@de.ibm.com>,
        Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
        Jon Cargille <jcargill@google.com>,
        kvm-ppc <kvm-ppc@vger.kernel.org>,
        linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
        Jim Mattson <jmattson@google.com>,
        Cornelia Huck <cohuck@redhat.com>,
        "open list:MIPS" <linux-mips@vger.kernel.org>,
        Vitaly Kuznetsov <vkuznets@redhat.com>
Subject: Re: disabling halt polling broken? (was Re: [PATCH 00/14] KVM:
 Halt-polling fixes, cleanups and a new stat)
Message-ID: <Y3UboELxugwDJkIG@google.com>
References: <20210925005528.1145584-1-seanjc@google.com>
 <03f2f5ab-e809-2ba5-bd98-3393c3b843d2@de.ibm.com>
 <YVHcY6y1GmvGJnMg@google.com>
 <f37ab68c-61ce-b6fb-7a49-831bacfc7424@redhat.com>
 <43e42f5c-9d9f-9e8b-3a61-9a053a818250@de.ibm.com>
 <CABgObfYtS6wiQe=BhF3t5usr7J6q4PWE4=rwZMMukfC9wT_6fA@mail.gmail.com>
 <YVIAdVxc+q2UWB+J@google.com>
 <32810c89-44c6-6780-9d05-e49f6b897b6e@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <32810c89-44c6-6780-9d05-e49f6b897b6e@huawei.com>
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 15, 2022 at 11:28:56AM +0800, wangyanan (Y) wrote:
> Hi Sean, Paolo,
> 
> I recently also notice the behavior change of param halt_poll_ns.
> Now it loses the ability to:
> 1) dynamically disable halt polling for all the running VMs
> by `echo 0 > /sys`
> 2) dynamically adjust the halt polling interval for all the
> running VMs by `echo * > /sys`
> 
> While in our cases, we usually use above two abilities, and
> KVM_CAP_HALT_POLL is not used yet.

I think the right path forward is to make KVM_CAP_HALT_POLL a pure
override of halt_poll_ns, and restore the pre-existing behavior of
halt_poll_ns whenever KVM_CAP_HALT_POLL is not used; see the patch
below for an example.

That will fix issues (1) and (2) above for any VM not using
KVM_CAP_HALT_POLL. If a VM is using KVM_CAP_HALT_POLL, it will ignore
all changes to halt_poll_ns. If we truly need a mechanism for admins to
disable halt-polling on VMs using KVM_CAP_HALT_POLL, we can introduce a
separate module parameter for that. But IMO, any setup that is
sophisticated enough to use KVM_CAP_HALT_POLL should also be able to use
KVM_CAP_HALT_POLL to disable halt polling.
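
To make the proposed semantics concrete, here is a minimal userspace
sketch of how the per-VM ceiling would be chosen. It is not kernel
code: struct vm_model and effective_max_halt_poll_ns() are hypothetical
names used only for illustration, and the initial halt_poll_ns value is
just an example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kvm.halt_poll_ns module parameter. */
static unsigned int halt_poll_ns = 200000;

/* Hypothetical, trimmed-down stand-in for struct kvm. */
struct vm_model {
	bool override_halt_poll_ns;	/* set once KVM_CAP_HALT_POLL is used */
	unsigned int max_halt_poll_ns;	/* per-VM value from the capability */
};

/* Pick the polling ceiling that applies to this VM right now. */
static unsigned int effective_max_halt_poll_ns(const struct vm_model *vm)
{
	/* A VM that used KVM_CAP_HALT_POLL ignores the module parameter. */
	if (vm->override_halt_poll_ns)
		return vm->max_halt_poll_ns;

	/* Everyone else follows the module parameter, even after creation. */
	return halt_poll_ns;
}
```

With this model, writing 0 to halt_poll_ns makes the function return 0
for every VM that never used the capability, while a VM that set its
own limit keeps returning that limit.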

If everyone is happy with this approach I can test and send a real patch
to the mailing list.

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e6e66c5e56f2..253ad055b6ad 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -788,6 +788,7 @@ struct kvm {
 	struct srcu_struct srcu;
 	struct srcu_struct irq_srcu;
 	pid_t userspace_pid;
+	bool override_halt_poll_ns;
 	unsigned int max_halt_poll_ns;
 	u32 dirty_ring_size;
 	bool vm_bugged;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 43bbe4fde078..479d0d0da0b5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1198,8 +1198,6 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
 			goto out_err_no_arch_destroy_vm;
 	}
 
-	kvm->max_halt_poll_ns = halt_poll_ns;
-
 	r = kvm_arch_init_vm(kvm, type);
 	if (r)
 		goto out_err_no_arch_destroy_vm;
@@ -3371,7 +3369,7 @@ void kvm_sigset_deactivate(struct kvm_vcpu *vcpu)
 	sigemptyset(&current->real_blocked);
 }
 
-static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
+static void grow_halt_poll_ns(struct kvm_vcpu *vcpu, unsigned int max)
 {
 	unsigned int old, val, grow, grow_start;
 
@@ -3385,8 +3383,8 @@ static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
 	if (val < grow_start)
 		val = grow_start;
 
-	if (val > vcpu->kvm->max_halt_poll_ns)
-		val = vcpu->kvm->max_halt_poll_ns;
+	if (val > max)
+		val = max;
 
 	vcpu->halt_poll_ns = val;
 out:
@@ -3501,10 +3499,17 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 {
 	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
 	bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
+	unsigned int max_halt_poll_ns;
 	ktime_t start, cur, poll_end;
+	struct kvm *kvm = vcpu->kvm;
 	bool waited = false;
 	u64 halt_ns;
 
+	if (kvm->override_halt_poll_ns)
+		max_halt_poll_ns = kvm->max_halt_poll_ns;
+	else
+		max_halt_poll_ns = READ_ONCE(halt_poll_ns);
+
 	start = cur = poll_end = ktime_get();
 	if (do_halt_poll) {
 		ktime_t stop = ktime_add_ns(start, vcpu->halt_poll_ns);
@@ -3545,17 +3550,16 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 	if (halt_poll_allowed) {
 		if (!vcpu_valid_wakeup(vcpu)) {
 			shrink_halt_poll_ns(vcpu);
-		} else if (vcpu->kvm->max_halt_poll_ns) {
+		} else if (max_halt_poll_ns) {
 			if (halt_ns <= vcpu->halt_poll_ns)
 				;
 			/* we had a long block, shrink polling */
-			else if (vcpu->halt_poll_ns &&
-				 halt_ns > vcpu->kvm->max_halt_poll_ns)
+			else if (vcpu->halt_poll_ns && halt_ns > max_halt_poll_ns)
 				shrink_halt_poll_ns(vcpu);
 			/* we had a short halt and our poll time is too small */
-			else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
-				 halt_ns < vcpu->kvm->max_halt_poll_ns)
-				grow_halt_poll_ns(vcpu);
+			else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
+				 halt_ns < max_halt_poll_ns)
+				grow_halt_poll_ns(vcpu, max_halt_poll_ns);
 		} else {
 			vcpu->halt_poll_ns = 0;
 		}
@@ -4588,6 +4592,7 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
 		if (cap->flags || cap->args[0] != (unsigned int)cap->args[0])
 			return -EINVAL;
 
+		kvm->override_halt_poll_ns = true;
 		kvm->max_halt_poll_ns = cap->args[0];
 		return 0;
 	}
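
The post-halt adjustment in the kvm_vcpu_halt() hunk above can be
modelled outside the kernel to see how a vCPU's poll window reacts to
the chosen ceiling. This is a simplified sketch under stated
assumptions: adjust_poll_ns(), grow_factor and grow_start are
hypothetical names and values, shrinking is modelled as a reset to 0,
and the invalid-wakeup case is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the grow tunables (example values). */
static unsigned int grow_factor = 2;
static unsigned int grow_start = 10000;

/*
 * Return the new poll window for a vCPU whose last halt took halt_ns,
 * given its current window poll_ns and the ceiling max that applies.
 */
static unsigned int adjust_poll_ns(unsigned int poll_ns, uint64_t halt_ns,
				   unsigned int max)
{
	unsigned int val;

	if (!max)
		return 0;		/* polling disabled for this VM */

	if (halt_ns <= poll_ns)
		return poll_ns;		/* wakeup arrived within the window */

	if (poll_ns && halt_ns > max)
		return 0;		/* long block: shrink (simplified) */

	if (poll_ns < max && halt_ns < max) {
		/* Short halt but window too small: grow, clamped to max. */
		val = poll_ns * grow_factor;
		if (val < grow_start)
			val = grow_start;
		if (val > max)
			val = max;
		return val;
	}

	return poll_ns;
}
```

Because max is read on every halt, a VM that does not use
KVM_CAP_HALT_POLL drops its window to 0 on the first halt after the
module parameter is set to 0.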

> 
> On 2021/9/28 1:33, Sean Christopherson wrote:
> > On Mon, Sep 27, 2021, Paolo Bonzini wrote:
> > > On Mon, Sep 27, 2021 at 5:17 PM Christian Borntraeger
> > > <borntraeger@de.ibm.com> wrote:
> > > > > So I think there are two possibilities that makes sense:
> > > > > 
> > > > > * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that
> > > > what about using halt_poll_ns for those VMs that did not uses KVM_CAP_HALT_POLL and the private number for those that did.
> > > Yes, that's what I meant.  David pointed out that doesn't allow you to
> > > disable halt polling altogether, but for that you can always ask each
> > > VM's userspace one by one, or just not use KVM_CAP_HALT_POLL. (Also, I
> > > don't know about Google's usecase, but mine was actually more about
> > > using KVM_CAP_HALT_POLL to *disable* halt polling on some VMs!).
> > I kinda like the idea if special-casing halt_poll_ns=0, e.g. for testing or
> > in-the-field mitigation if halt-polling is broken.  It'd be trivial to support, e.g.
> Do we have any plan to repost the diff as a fix?
> I would be very nice that this issue can be solved.
> 
> Besides, I think we may need some Doc for users to describe
> how halt_poll_ns works with KVM_CAP_HALT_POLL, like
> "Documentation/virt/guest-halt-polling.rst".
> > @@ -3304,19 +3304,23 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
> >                  update_halt_poll_stats(vcpu, start, poll_end, !waited);
> > 
> >          if (halt_poll_allowed) {
> > +               max_halt_poll_ns = vcpu->kvm->max_halt_poll_ns;
> > +               if (!max_halt_poll_ns || !halt_poll_ns)  <------ squish the max if halt_poll_ns==0
> > +                       max_halt_poll_ns = halt_poll_ns;
> > +
> Does this mean that KVM_CAP_HALT_POLL will not be able to
> disable halt polling for a VM individually when halt_poll_ns !=0?
> >                  if (!vcpu_valid_wakeup(vcpu)) {
> >                          shrink_halt_poll_ns(vcpu);
> > -               } else if (vcpu->kvm->max_halt_poll_ns) {
> > +               } else if (max_halt_poll_ns) {
> >                          if (halt_ns <= vcpu->halt_poll_ns)
> >                                  ;
> >                          /* we had a long block, shrink polling */
> >                          else if (vcpu->halt_poll_ns &&
> > -                                halt_ns > vcpu->kvm->max_halt_poll_ns)
> > +                                halt_ns > max_halt_poll_ns)
> >                                  shrink_halt_poll_ns(vcpu);
> >                          /* we had a short halt and our poll time is too small */
> > -                       else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
> > -                                halt_ns < vcpu->kvm->max_halt_poll_ns)
> > -                               grow_halt_poll_ns(vcpu);
> > +                       else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
> > +                                halt_ns < max_halt_poll_ns)
> > +                               grow_halt_poll_ns(vcpu, max_halt_poll_ns);
> >                  } else {
> >                          vcpu->halt_poll_ns = 0;
> >                  }
> Thanks,
> Yanan


On Tue, Nov 15, 2022 at 11:28:56AM +0800, wangyanan (Y) wrote:
> Hi Sean, Paolo,
> 
> I recently also noticed the behavior change of the halt_poll_ns param.
> It has now lost the ability to:
> 1) dynamically disable halt polling for all the running VMs
> by `echo 0 > /sys`
> 2) dynamically adjust the halt polling interval for all the
> running VMs by `echo * > /sys`
> 
> In our case, we usually rely on the above two abilities, and
> KVM_CAP_HALT_POLL is not used yet.

I think the right path forward is to make KVM_CAP_HALT_POLL a pure
override of halt_poll_ns, and restore the pre-existing behavior of
halt_poll_ns whenever KVM_CAP_HALT_POLL is not used (e.g. see the patch
below).
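
As a rough illustration of the intended semantics, here is a small
userspace model of the selection rule (not kernel code). The names
struct vm_model, effective_max_halt_poll_ns() and halt_poll_model_demo()
are hypothetical and exist only for this sketch; the global halt_poll_ns
stands in for the module parameter, and the numeric values are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the halt_poll_ns module parameter (arbitrary example value). */
static unsigned int halt_poll_ns = 200000;

/* Hypothetical model of the two per-VM fields touched by the patch. */
struct vm_model {
	bool override_halt_poll_ns;	/* true once KVM_CAP_HALT_POLL has been used */
	unsigned int max_halt_poll_ns;	/* per-VM limit supplied via the capability */
};

/* Pick the limit the same way the proposed kvm_vcpu_halt() hunk does. */
static unsigned int effective_max_halt_poll_ns(const struct vm_model *vm)
{
	if (vm->override_halt_poll_ns)
		return vm->max_halt_poll_ns;
	return halt_poll_ns;
}

/* Walk through a plain VM and a VM that used the capability. */
static void halt_poll_model_demo(void)
{
	struct vm_model plain = { .override_halt_poll_ns = false };
	struct vm_model capped = { .override_halt_poll_ns = true,
				   .max_halt_poll_ns = 50000 };

	/* Admin disables polling globally: only the plain VM follows. */
	halt_poll_ns = 0;
	assert(effective_max_halt_poll_ns(&plain) == 0);
	assert(effective_max_halt_poll_ns(&capped) == 50000);

	/* Admin retunes the interval: again only the plain VM follows. */
	halt_poll_ns = 100000;
	assert(effective_max_halt_poll_ns(&plain) == 100000);
	assert(effective_max_halt_poll_ns(&capped) == 50000);
}
```

Calling halt_poll_model_demo() from a test main() exercises both cases;
the point is only that the module parameter is consulted at halt time
for VMs that never set the capability.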

That will fix issues (1) and (2) above for any VM not using
KVM_CAP_HALT_POLL. If a VM is using KVM_CAP_HALT_POLL, it will ignore
all changes to halt_poll_ns. If we truly need a mechanism for admins to
disable halt-polling on VMs using KVM_CAP_HALT_POLL, we can introduce a
separate module parameter for that. But IMO, any setup that is
sophisticated enough to use KVM_CAP_HALT_POLL should also be able to use
KVM_CAP_HALT_POLL to disable halt polling.
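
For reference, a VMM can do that with KVM_ENABLE_CAP on the VM file
descriptor. The following is a minimal sketch, assuming vm_fd was
obtained from KVM_CREATE_VM and that the installed <linux/kvm.h>
defines KVM_CAP_HALT_POLL; the helper name vm_set_max_halt_poll_ns()
is made up for illustration.

```c
#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Set the per-VM maximum halt-polling time in nanoseconds; ns == 0
 * disables halt polling for this VM. vm_fd is assumed to come from
 * KVM_CREATE_VM. Returns 0 on success, -1 with errno set on failure.
 */
static int vm_set_max_halt_poll_ns(int vm_fd, unsigned int ns)
{
	struct kvm_enable_cap cap;

	memset(&cap, 0, sizeof(cap));	/* flags must be zero */
	cap.cap = KVM_CAP_HALT_POLL;
	cap.args[0] = ns;		/* must fit in an unsigned int */

	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}
```

With the patch below, the first successful call would also mark the VM
as overriding halt_poll_ns, so later writes to the module parameter
would no longer affect it.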

If everyone is happy with this approach, I can test and send a real patch
to the mailing list.

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e6e66c5e56f2..253ad055b6ad 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -788,6 +788,7 @@ struct kvm {
 	struct srcu_struct srcu;
 	struct srcu_struct irq_srcu;
 	pid_t userspace_pid;
+	bool override_halt_poll_ns;
 	unsigned int max_halt_poll_ns;
 	u32 dirty_ring_size;
 	bool vm_bugged;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 43bbe4fde078..479d0d0da0b5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1198,8 +1198,6 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
 			goto out_err_no_arch_destroy_vm;
 	}
 
-	kvm->max_halt_poll_ns = halt_poll_ns;
-
 	r = kvm_arch_init_vm(kvm, type);
 	if (r)
 		goto out_err_no_arch_destroy_vm;
@@ -3371,7 +3369,7 @@ void kvm_sigset_deactivate(struct kvm_vcpu *vcpu)
 	sigemptyset(&current->real_blocked);
 }
 
-static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
+static void grow_halt_poll_ns(struct kvm_vcpu *vcpu, unsigned int max)
 {
 	unsigned int old, val, grow, grow_start;
 
@@ -3385,8 +3383,8 @@ static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
 	if (val < grow_start)
 		val = grow_start;
 
-	if (val > vcpu->kvm->max_halt_poll_ns)
-		val = vcpu->kvm->max_halt_poll_ns;
+	if (val > max)
+		val = max;
 
 	vcpu->halt_poll_ns = val;
 out:
@@ -3501,10 +3499,17 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 {
 	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
 	bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
+	unsigned int max_halt_poll_ns;
 	ktime_t start, cur, poll_end;
+	struct kvm *kvm = vcpu->kvm;
 	bool waited = false;
 	u64 halt_ns;
 
+	if (kvm->override_halt_poll_ns)
+		max_halt_poll_ns = kvm->max_halt_poll_ns;
+	else
+		max_halt_poll_ns = READ_ONCE(halt_poll_ns);
+
 	start = cur = poll_end = ktime_get();
 	if (do_halt_poll) {
 		ktime_t stop = ktime_add_ns(start, vcpu->halt_poll_ns);
@@ -3545,17 +3550,16 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 	if (halt_poll_allowed) {
 		if (!vcpu_valid_wakeup(vcpu)) {
 			shrink_halt_poll_ns(vcpu);
-		} else if (vcpu->kvm->max_halt_poll_ns) {
+		} else if (max_halt_poll_ns) {
 			if (halt_ns <= vcpu->halt_poll_ns)
 				;
 			/* we had a long block, shrink polling */
-			else if (vcpu->halt_poll_ns &&
-				 halt_ns > vcpu->kvm->max_halt_poll_ns)
+			else if (vcpu->halt_poll_ns && halt_ns > max_halt_poll_ns)
 				shrink_halt_poll_ns(vcpu);
 			/* we had a short halt and our poll time is too small */
-			else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
-				 halt_ns < vcpu->kvm->max_halt_poll_ns)
-				grow_halt_poll_ns(vcpu);
+			else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
+				 halt_ns < max_halt_poll_ns)
+				grow_halt_poll_ns(vcpu, max_halt_poll_ns);
 		} else {
 			vcpu->halt_poll_ns = 0;
 		}
@@ -4588,6 +4592,7 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
 		if (cap->flags || cap->args[0] != (unsigned int)cap->args[0])
 			return -EINVAL;
 
+		kvm->override_halt_poll_ns = true;
 		kvm->max_halt_poll_ns = cap->args[0];
 		return 0;
 	}

> 
> On 2021/9/28 1:33, Sean Christopherson wrote:
> > On Mon, Sep 27, 2021, Paolo Bonzini wrote:
> > > On Mon, Sep 27, 2021 at 5:17 PM Christian Borntraeger
> > > <borntraeger@de.ibm.com> wrote:
> > > > > So I think there are two possibilities that make sense:
> > > > > 
> > > > > * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that
> > > > What about using halt_poll_ns for those VMs that did not use KVM_CAP_HALT_POLL and the private number for those that did?
> > > Yes, that's what I meant.  David pointed out that doesn't allow you to
> > > disable halt polling altogether, but for that you can always ask each
> > > VM's userspace one by one, or just not use KVM_CAP_HALT_POLL. (Also, I
> > > don't know about Google's usecase, but mine was actually more about
> > > using KVM_CAP_HALT_POLL to *disable* halt polling on some VMs!).
> > I kinda like the idea of special-casing halt_poll_ns=0, e.g. for testing or
> > in-the-field mitigation if halt-polling is broken.  It'd be trivial to support, e.g.
> Do we have any plan to repost the diff as a fix?
> It would be very nice if this issue could be solved.
> 
> Besides, I think we may need some Doc for users to describe
> how halt_poll_ns works with KVM_CAP_HALT_POLL, like
> "Documentation/virt/guest-halt-polling.rst".
> > @@ -3304,19 +3304,23 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
> >                  update_halt_poll_stats(vcpu, start, poll_end, !waited);
> > 
> >          if (halt_poll_allowed) {
> > +               max_halt_poll_ns = vcpu->kvm->max_halt_poll_ns;
> > +               if (!max_halt_poll_ns || !halt_poll_ns)  <------ squish the max if halt_poll_ns==0
> > +                       max_halt_poll_ns = halt_poll_ns;
> > +
> Does this mean that KVM_CAP_HALT_POLL will not be able to
> disable halt polling for a VM individually when halt_poll_ns !=0?
> >                  if (!vcpu_valid_wakeup(vcpu)) {
> >                          shrink_halt_poll_ns(vcpu);
> > -               } else if (vcpu->kvm->max_halt_poll_ns) {
> > +               } else if (max_halt_poll_ns) {
> >                          if (halt_ns <= vcpu->halt_poll_ns)
> >                                  ;
> >                          /* we had a long block, shrink polling */
> >                          else if (vcpu->halt_poll_ns &&
> > -                                halt_ns > vcpu->kvm->max_halt_poll_ns)
> > +                                halt_ns > max_halt_poll_ns)
> >                                  shrink_halt_poll_ns(vcpu);
> >                          /* we had a short halt and our poll time is too small */
> > -                       else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
> > -                                halt_ns < vcpu->kvm->max_halt_poll_ns)
> > -                               grow_halt_poll_ns(vcpu);
> > +                       else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
> > +                                halt_ns < max_halt_poll_ns)
> > +                               grow_halt_poll_ns(vcpu, max_halt_poll_ns);
> >                  } else {
> >                          vcpu->halt_poll_ns = 0;
> >                  }
> > _______________________________________________
> > kvmarm mailing list
> > kvmarm@lists.cs.columbia.edu
> > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> > .
> Thanks,
> Yanan

