From: Wanpeng Li
Date: Tue, 27 Aug 2019 08:43:13 +0800
Subject: Re: [PATCH] cpuidle-haltpoll: Enable kvm guest polling when dedicated physical CPUs are available
To: Marcelo Tosatti
Cc: Paolo Bonzini, "Rafael J. Wysocki", LKML, kvm, Radim Krčmář, Linux PM, "Michael S. Tsirkin"
In-Reply-To: <20190826204045.GA24697@amt.cnet>
References: <1564643196-7797-1-git-send-email-wanpengli@tencent.com> <7b1e3025-f513-7068-32ac-4830d67b65ac@intel.com> <20190803202058.GA9316@amt.cnet> <20190826204045.GA24697@amt.cnet>
Tsirkin" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Cc Michael S. Tsirkin, On Tue, 27 Aug 2019 at 04:42, Marcelo Tosatti wrote: > > On Tue, Aug 13, 2019 at 08:55:29AM +0800, Wanpeng Li wrote: > > On Sun, 4 Aug 2019 at 04:21, Marcelo Tosatti wrot= e: > > > > > > On Thu, Aug 01, 2019 at 06:54:49PM +0200, Paolo Bonzini wrote: > > > > On 01/08/19 18:51, Rafael J. Wysocki wrote: > > > > > On 8/1/2019 9:06 AM, Wanpeng Li wrote: > > > > >> From: Wanpeng Li > > > > >> > > > > >> The downside of guest side polling is that polling is performed = even > > > > >> with other runnable tasks in the host. However, even if poll in = kvm > > > > >> can aware whether or not other runnable tasks in the same pCPU, = it > > > > >> can still incur extra overhead in over-subscribe scenario. Now w= e can > > > > >> just enable guest polling when dedicated pCPUs are available. > > > > >> > > > > >> Cc: Rafael J. Wysocki > > > > >> Cc: Paolo Bonzini > > > > >> Cc: Radim Kr=C4=8Dm=C3=A1=C5=99 > > > > >> Cc: Marcelo Tosatti > > > > >> Signed-off-by: Wanpeng Li > > > > > > > > > > Paolo, Marcelo, any comments? > > > > > > > > Yes, it's a good idea. > > > > > > > > Acked-by: Paolo Bonzini > > > > > > > > Paolo > > > > > > > Hi Marcelo, > > > > Sorry for the late response. > > > > > I think KVM_HINTS_REALTIME is being abused somewhat. > > > It has no clear meaning and used in different locations > > > for different purposes. > > > > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > > KVM_HINTS_REALTIME 0 guest checks this feature bit= to > > > > determine that vCPUs are never > > > > preempted for an unlimited time > > Unlimited time means infinite time, or unlimited time means > 10s ? 1s ? The former one I think. There is a discussion here https://lkml.org/lkml/2018/5/17/612 > > The previous definition was much better IMO: HINTS_DEDICATED. > > > > allowing optimizations > > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > > > > Now it disables pv queued spinlock, > > OK. > > > pv tlb shootdown, > > OK. > > > pv sched yield > > "The idea is from Xen, when sending a call-function IPI-many to vCPUs, > yield if any of the IPI target vCPUs was preempted. 17% performance > increasement of ebizzy benchmark can be observed in an over-subscribe > environment. (w/ kvm-pv-tlb disabled, testing TLB flush call-function > IPI-many since call-function is not easy to be trigged by userspace > workload)." > > This can probably hurt if vcpus are rarely preempted. That's why I add the KVM_HINTS_REALTIME checking here. > > > which are not expected present in vCPUs are never preempted for an > > unlimited time scenario. > > > > > > > > For example, i think that using pv queued spinlocks and > > > haltpoll is a desired scenario, which the patch below disallows. > > > > So even if dedicated pCPU is available, pv queued spinlocks should > > still be chose if something like vhost-kthreads are used instead of > > DPDK/vhost-user. > > Can't you enable the individual features you need for optimizing > the overcommitted case? 
> > which are not expected to be present in the "vCPUs are never
> > preempted for an unlimited time" scenario.
> >
> > >
> > > For example, I think that using pv queued spinlocks and
> > > haltpoll is a desired scenario, which the patch below disallows.
> >
> > So even if a dedicated pCPU is available, pv queued spinlocks should
> > still be chosen if something like vhost-kthreads is used instead of
> > DPDK/vhost-user.
>
> Can't you enable the individual features you need for optimizing
> the overcommitted case? This is how things have been done historically:
> if a new feature is available, you enable it to get the desired
> performance. x2apic, invariant-tsc, cpuidle haltpoll...
>
> So in your case: enable pv sched yield, enable pv tlb shootdown.

Both of those optimize IPI-heavy paths: pv sched yield covers
call-function IPIs, and pv tlb shootdown covers TLB invalidation. So it
is still a different case here. In our latest testing, on an 80-pCPU
host running three 80-vCPU VMs, the numbers are even better than on the
64-pCPU host I used when posting the patches:

 ebizzy -M      vanilla    optimized    boost
 1VM             31234       34489        10%
 2VM              5380       26664       396%
 3VM              2967       23140       679%

> > kvm adaptive halt-polling will compete with vhost-kthreads; however,
> > polling in the guest is unaware of other runnable tasks in the host,
> > which will defeat vhost-kthreads.
>
> It depends on how much work vhost-kthreads needs to do, how successful
> halt-polling in the guest is, and what improvement halt-polling brings.
> The amount of polling will be reduced to zero if polling
> is not successful.

We observe vhost-kthreads competing with the vCPUs' adaptive
halt-polling in KVM; it hurts performance in an over-subscribed
production environment, and polling in the guest can make it worse.
(A sketch of the shrink-to-zero behaviour mentioned above is appended
after my signature.)

Regards,
Wanpeng Li
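P.S. The "reduced to zero" behaviour referenced above works roughly as
in the sketch below. This is a simplified illustration of the adaptive
poll-window policy, not the exact governor code; the constants, the
helper name and the plain global stand in for the real per-CPU state
and module parameters.

/*
 * Simplified sketch of the guest-side adaptive poll window: grow the
 * window while halts stay short, shrink it towards zero once polling
 * keeps failing (the CPU ends up blocking longer than the cap anyway).
 */
#define GUEST_HALT_POLL_NS	200000ULL  /* cap on the poll window (illustrative) */
#define POLL_GROW		2          /* growth factor */
#define POLL_SHRINK		2          /* shrink divisor */
#define POLL_GROW_START		50000ULL   /* first non-zero window */

static unsigned long long poll_limit_ns;   /* per-CPU in a real driver */

static void adjust_poll_limit(unsigned long long block_ns)
{
	if (block_ns <= poll_limit_ns) {
		/* Polling would have caught this wakeup: leave the window alone. */
		return;
	}

	if (block_ns <= GUEST_HALT_POLL_NS) {
		/* Short halt: poll a bit longer next time. */
		unsigned long long val = poll_limit_ns * POLL_GROW;

		if (val < POLL_GROW_START)
			val = POLL_GROW_START;
		if (val > GUEST_HALT_POLL_NS)
			val = GUEST_HALT_POLL_NS;
		poll_limit_ns = val;
	} else {
		/* Long halt: polling is not paying off, shrink towards zero. */
		poll_limit_ns /= POLL_SHRINK;
	}
}

The shrink divisor is what eventually drives the window to zero under
sustained overcommit, which is the behaviour described above; the open
question is how much poll time is burnt before it gets there.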