From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kele Huang
Date: Tue, 16 Nov 2021 17:30:40 +0800
Subject: Re: [External] Re: Re: [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC
In-Reply-To: <20211116090604.GA12758@gao-cwp>
References: <20211108095931.618865-1-huangkele@bytedance.com> <20211116090604.GA12758@gao-cwp>
To: Chao Gao
Cc: zhenwei pi, Wanpeng Li, Maxim Levitsky, Paolo Bonzini, chaiwen.cc@bytedance.com,
    xieyongji@bytedance.com, dengliang.1214@bytedance.com, Sean Christopherson,
    Vitaly Kuznetsov, Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, Dave Hansen, the arch/x86 maintainers, H. Peter Anvin,
    kvm, LKML
Peter Anvin" , kvm , LKML Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org > The recently posted Intel IPI virtualization will accelerate unicast > ipi but not broadcast ipis, AMD AVIC accelerates unicast ipi well but > accelerates broadcast ipis worse than pv ipis. Could we just handle > unicast ipi here? Thanks for the explanation! It is true that AVIC does not always perform better than PV IPI, actually not even swx2apic. > So agree with Wanpeng's point, is it possible to separate single IPI and > broadcast IPI on a hardware acceleration platform? > how about just correcting the logic for xapic: > From 13447b221252b64cd85ed1329f7d917afa54efc8 Mon Sep 17 00:00:00 2001 > From: Jiaqing Zhao > Date: Fri, 9 Apr 2021 13:53:39 +0800 > Subject: [PATCH 1/2] x86/apic/flat: Add specific send IPI logic > Currently, apic_flat.send_IPI() uses default_send_IPI_single(), which > is a wrapper of apic->send_IPI_mask(). Since commit aaffcfd1e82d > ("KVM: X86: Implement PV IPIs in linux guest"), KVM PV IPI driver will > override apic->send_IPI_mask(), and may cause unwated side effects. > This patch removes such side effects by creating a specific send_IPI > method. > Signed-off-by: Jiaqing Zhao Actually, I think this issue is more about how to sort out the relationship between AVIC and PV IPI. As far as I understand, currently, no matter the option from userspace or the determination made in kernel works in some way, but not in the migration scenario. For instance, migration with AVIC feature changes can make guests lose the PV IPI feature needlessly. Besides, the current patch is not consistent with KVM_CAP_ENFORCE_PV_FEATURE_CPUID. Paolo's advice about using a new hint shall work well. Currently try working on it. Best regards, Kele On Tue, Nov 16, 2021 at 4:56 PM Chao Gao wrote: > > On Tue, Nov 16, 2021 at 10:56:25AM +0800, zhenwei pi wrote: > > > > > >On 11/16/21 10:48 AM, Wanpeng Li wrote: > >> On Mon, 8 Nov 2021 at 22:09, Maxim Levitsky wrote: > >> > > >> > On Mon, 2021-11-08 at 11:30 +0100, Paolo Bonzini wrote: > >> > > On 11/8/21 10:59, Kele Huang wrote: > >> > > > Currently, AVIC is disabled if x2apic feature is exposed to guest > >> > > > or in-kernel PIT is in re-injection mode. > >> > > > > >> > > > We can enable AVIC with options: > >> > > > > >> > > > Kmod args: > >> > > > modprobe kvm_amd avic=1 nested=0 npt=1 > >> > > > QEMU args: > >> > > > ... -cpu host,-x2apic -global kvm-pit.lost_tick_policy=discard ... > >> > > > > >> > > > When LAPIC works in xapic mode, both AVIC and PV_SEND_IPI feature > >> > > > can accelerate IPI operations for guest. However, the relationship > >> > > > between AVIC and PV_SEND_IPI feature is not sorted out. > >> > > > > >> > > > In logical, AVIC accelerates most of frequently IPI operations > >> > > > without VMM intervention, while the re-hooking of apic->send_IPI_xxx > >> > > > from PV_SEND_IPI feature masks out it. People can get confused > >> > > > if AVIC is enabled while getting lots of hypercall kvm_exits > >> > > > from IPI. 
> >> > > >
> >> > > > In performance, the benchmark tool at
> >> > > > https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com/
> >> > > > shows the results below:
> >> > > >
> >> > > > Test env:
> >> > > > CPU: AMD EPYC 7742 64-Core Processor
> >> > > > 2 vCPUs pinned 1:1
> >> > > > idle=poll
> >> > > >
> >> > > > Test result (average ns per IPI over many runs):
> >> > > > PV_SEND_IPI : 1860
> >> > > > AVIC        : 1390
> >> > > >
> >> > > > Besides, the discussion at https://lkml.org/lkml/2021/10/20/423
> >> > > > has some solid performance test results on this.
> >> > > >
> >> > > > This patch fixes the issue by masking out the PV_SEND_IPI feature
> >> > > > when AVIC is enabled while setting up the guest vCPUs' CPUID.
> >> > > >
> >> > > > Signed-off-by: Kele Huang
> >> > >
> >> > > AVIC can change across migration. I think we should instead use a new
> >> > > KVM_HINTS_* bit (KVM_HINTS_ACCELERATED_LAPIC or something like that).
> >> > > The KVM_HINTS_* bits are intended to be changeable across migration,
> >> > > even though we don't have for now anything equivalent to the Hyper-V
> >> > > reenlightenment interrupt.
> >> >
> >> > Note that the same issue exists with Hyper-V. It also has a PV APIC,
> >> > which is harmful when AVIC is enabled (that is, the guest uses it
> >> > instead of using AVIC, negating AVIC's benefits).
> >> >
> >> > Also note that Intel recently posted IPI virtualization, which
> >> > will make this issue relevant to APICv too soon.
> >>
> >> The recently posted Intel IPI virtualization will accelerate unicast
> >> ipi but not broadcast ipis, AMD AVIC accelerates unicast ipi well but
> >> accelerates broadcast ipis worse than pv ipis. Could we just handle
> >> unicast ipi here?
> >>
> >> Wanpeng
> >>
> >Depending on the number of target vCPUs, broadcast IPIs get unstable
> >performance on AVIC, usually worse than PV Send IPI.
> >So I agree with Wanpeng's point; is it possible to separate single IPI
> >and broadcast IPI on a hardware acceleration platform?
>
> Actually, this is how the kernel works in x2apic mode: it uses the PV
> interface (hypercall) to send multi-cast IPIs and writes the ICR MSR
> directly to send unicast IPIs.
>
> But if the guest works in xapic mode, both unicast and multi-cast IPIs
> are issued via the PV interface. It is a side effect introduced by
> commit aaffcfd1e82d.
>
> How about just correcting the logic for xapic:
>
> From 13447b221252b64cd85ed1329f7d917afa54efc8 Mon Sep 17 00:00:00 2001
> From: Jiaqing Zhao
> Date: Fri, 9 Apr 2021 13:53:39 +0800
> Subject: [PATCH 1/2] x86/apic/flat: Add specific send IPI logic
>
> Currently, apic_flat.send_IPI() uses default_send_IPI_single(), which
> is a wrapper of apic->send_IPI_mask(). Since commit aaffcfd1e82d
> ("KVM: X86: Implement PV IPIs in linux guest"), the KVM PV IPI driver
> overrides apic->send_IPI_mask(), which may cause unwanted side effects.
>
> This patch removes such side effects by creating a specific send_IPI
> method.
>
> Signed-off-by: Jiaqing Zhao
> ---
>  arch/x86/kernel/apic/apic_flat_64.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/apic/apic_flat_64.c b/arch/x86/kernel/apic/apic_flat_64.c
> index 8f72b4351c9f..3196bf220230 100644
> --- a/arch/x86/kernel/apic/apic_flat_64.c
> +++ b/arch/x86/kernel/apic/apic_flat_64.c
> @@ -64,6 +64,13 @@ static void flat_send_IPI_mask(const struct cpumask *cpumask, int vector)
>  	_flat_send_IPI_mask(mask, vector);
>  }
>
> +static void flat_send_IPI_single(int cpu, int vector)
> +{
> +	unsigned long mask = cpumask_bits(cpumask_of(cpu))[0];
> +
> +	_flat_send_IPI_mask(mask, vector);
> +}
> +
>  static void
>  flat_send_IPI_mask_allbutself(const struct cpumask *cpumask, int vector)
>  {
> @@ -132,7 +139,7 @@ static struct apic apic_flat __ro_after_init = {
>
>  	.calc_dest_apicid		= apic_flat_calc_apicid,
>
> -	.send_IPI			= default_send_IPI_single,
> +	.send_IPI			= flat_send_IPI_single,
>  	.send_IPI_mask			= flat_send_IPI_mask,
>  	.send_IPI_mask_allbutself	= flat_send_IPI_mask_allbutself,
>  	.send_IPI_allbutself		= default_send_IPI_allbutself,
> --
> 2.27.0
>
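
For context, this matches my reading of why unicast IPIs in xapic mode
currently go through the PV path: the KVM guest code only re-hooks the
mask-based callbacks, and apic_flat's .send_IPI funnels the unicast case
back through them. Abridged from current mainline (arch/x86/kernel/kvm.c
and arch/x86/kernel/apic/ipi.c), so treat it as a reference rather than
part of the patch above:

static void __init kvm_setup_pv_ipi(void)
{
	/* Only the mask-based callbacks are overridden ... */
	apic->send_IPI_mask = kvm_send_ipi_mask;
	apic->send_IPI_mask_allbutself = kvm_send_ipi_mask_allbutself;
	pr_info("setup PV IPIs\n");
}

void default_send_IPI_single(int cpu, int vector)
{
	/*
	 * ... and the flat APIC's unicast path is just a wrapper around
	 * ->send_IPI_mask(), so in xapic mode it ends up in the hypercall
	 * too, which is the side effect Chao's patch removes.
	 */
	apic->send_IPI_mask(cpumask_of(cpu), vector);
}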
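
And here is the rough guest-side sketch I mentioned at the top of this
mail. It only illustrates the direction Paolo suggested, and nothing here
is tested: KVM_HINTS_ACCELERATED_LAPIC is just the name he floated, it is
not defined in the uapi headers yet, and the bit value is a placeholder.

/* Placeholder; not defined in include/uapi/linux/kvm_para.h yet. */
#define KVM_HINTS_ACCELERATED_LAPIC	1

static void __init kvm_apic_init(void)
{
	/*
	 * Skip the PV IPI hooks when the host hints that the local APIC
	 * is already hardware-accelerated (AVIC / Intel IPI
	 * virtualization), so the guest keeps the accelerated
	 * architectural IPI path instead of hypercalls.
	 */
	if (pv_ipi_supported() &&
	    !kvm_para_has_hint(KVM_HINTS_ACCELERATED_LAPIC))
		kvm_setup_pv_ipi();
}

An init-time check like this obviously does not cover the hint changing
across migration; that is the part I still need to figure out.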