From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=kpGn=KN=nongnu.org=qemu-devel-bounces+qemu-devel=archiver.kernel.org@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-15.3 required=3.0 tests=BAYES_00,
	HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1
	autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id C89A2C433ED
	for <qemu-devel@archiver.kernel.org>; Tue, 18 May 2021 11:49:10 +0000 (UTC)
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id 39836611BF
	for <qemu-devel@archiver.kernel.org>; Tue, 18 May 2021 11:49:10 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 39836611BF
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=huawei.com
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Received: from localhost ([::1]:46912 helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org>)
	id 1liyDl-0002OL-Ca
	for qemu-devel@archiver.kernel.org; Tue, 18 May 2021 07:49:09 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:38178)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <wangyanan55@huawei.com>)
 id 1liyCH-0000ey-C5; Tue, 18 May 2021 07:47:37 -0400
Received: from szxga07-in.huawei.com ([45.249.212.35]:3599)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <wangyanan55@huawei.com>)
 id 1liyCE-0007sc-EV; Tue, 18 May 2021 07:47:37 -0400
Received: from dggems704-chm.china.huawei.com (unknown [172.30.72.60])
 by szxga07-in.huawei.com (SkyGuard) with ESMTP id 4FkvLJ4CvLzCtyG;
 Tue, 18 May 2021 19:44:40 +0800 (CST)
Received: from dggpemm500023.china.huawei.com (7.185.36.83) by
 dggems704-chm.china.huawei.com (10.3.19.181) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.2176.2; Tue, 18 May 2021 19:47:26 +0800
Received: from [10.174.187.128] (10.174.187.128) by
 dggpemm500023.china.huawei.com (7.185.36.83) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id
 15.1.2176.2; Tue, 18 May 2021 19:47:25 +0800
Subject: Re: [RFC PATCH v3 6/9] hw/arm/virt-acpi-build: Use possible cpus in
 generation of MADT
To: Andrew Jones <drjones@redhat.com>
References: <20210516102900.28036-1-wangyanan55@huawei.com>
 <20210516102900.28036-7-wangyanan55@huawei.com>
 <20210517074256.xjqwejbi4mfsvug2@gator.home>
 <ac1b0f17-523d-adb8-c4f4-aa5c93966726@huawei.com>
 <20210518081550.d3hof7jr5soeuwo5@gator.home>
From: "wangyanan (Y)" <wangyanan55@huawei.com>
Message-ID: <1a0a9dea-cd42-a2e5-6b1c-0055391d439f@huawei.com>
Date: Tue, 18 May 2021 19:47:24 +0800
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
 Thunderbird/78.4.0
MIME-Version: 1.0
In-Reply-To: <20210518081550.d3hof7jr5soeuwo5@gator.home>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Content-Language: en-US
X-Originating-IP: [10.174.187.128]
X-ClientProxiedBy: dggeme716-chm.china.huawei.com (10.1.199.112) To
 dggpemm500023.china.huawei.com (7.185.36.83)
X-CFilter-Loop: Reflected
Received-SPF: pass client-ip=45.249.212.35;
 envelope-from=wangyanan55@huawei.com; helo=szxga07-in.huawei.com
X-Spam_score_int: -41
X-Spam_score: -4.2
X-Spam_bar: ----
X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, NICE_REPLY_A=-0.001,
 RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001,
 SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: Barry Song <song.bao.hua@hisilicon.com>,
 Peter Maydell <peter.maydell@linaro.org>,
 "Michael S . Tsirkin" <mst@redhat.com>, wanghaibin.wang@huawei.com,
 zhukeqian1@huawei.com, qemu-devel@nongnu.org, yangyicong@huawei.com,
 Shannon Zhao <shannon.zhaosl@gmail.com>, qemu-arm@nongnu.org,
 Alistair Francis <alistair.francis@wdc.com>, prime.zeng@hisilicon.com,
 Paolo Bonzini <pbonzini@redhat.com>, yuzenghui@huawei.com,
 Igor Mammedov <imammedo@redhat.com>,
 =?UTF-8?Q?Philippe_Mathieu-Daud=c3=a9?= <philmd@redhat.com>,
 David Gibson <david@gibson.dropbear.id.au>
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: "Qemu-devel"
 <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org>

Hi Drew,

On 2021/5/18 16:15, Andrew Jones wrote:
> On Tue, May 18, 2021 at 12:27:59AM +0800, wangyanan (Y) wrote:
>> Hi Drew,
>>
>> On 2021/5/17 15:42, Andrew Jones wrote:
>>> On Sun, May 16, 2021 at 06:28:57PM +0800, Yanan Wang wrote:
>>>> When building ACPI tables regarding CPUs we should always build
>>>> them for the number of possible CPUs, not the number of present
>>>> CPUs. So we create gicc nodes in MADT for possible cpus and then
>>>> ensure only the present CPUs are marked ENABLED. Furthermore, it
>>>> also needed if we are going to support CPU hotplug in the future.
>>>>
>>>> Co-developed-by: Andrew Jones <drjones@redhat.com>
>>>> Signed-off-by: Andrew Jones <drjones@redhat.com>
>>>> Co-developed-by: Ying Fang <fangying1@huawei.com>
>>>> Signed-off-by: Ying Fang <fangying1@huawei.com>
>>>> Co-developed-by: Yanan Wang <wangyanan55@huawei.com>
>>>> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
>>>> ---
>>>>    hw/arm/virt-acpi-build.c | 29 +++++++++++++++++++++++++----
>>>>    1 file changed, 25 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>>>> index a2d8e87616..4d64aeb865 100644
>>>> --- a/hw/arm/virt-acpi-build.c
>>>> +++ b/hw/arm/virt-acpi-build.c
>>>> @@ -481,6 +481,9 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>>>>        const int *irqmap = vms->irqmap;
>>>>        AcpiMadtGenericDistributor *gicd;
>>>>        AcpiMadtGenericMsiFrame *gic_msi;
>>>> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
>>>> +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(MACHINE(vms));
>>>> +    bool pmu;
>>>>        int i;
>>>>        acpi_data_push(table_data, sizeof(AcpiMultipleApicTable));
>>>> @@ -491,11 +494,21 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>>>>        gicd->base_address = cpu_to_le64(memmap[VIRT_GIC_DIST].base);
>>>>        gicd->version = vms->gic_version;
>>>> -    for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {
>>>> +    for (i = 0; i < possible_cpus->len; i++) {
>>>>            AcpiMadtGenericCpuInterface *gicc = acpi_data_push(table_data,
>>>>                                                               sizeof(*gicc));
>>>>            ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
>>>> +        /*
>>>> +         * PMU should have been either implemented for all CPUs or not,
>>>> +         * so we only get information from the first CPU, which could
>>>> +         * represent the others.
>>>> +         */
>>>> +        if (i == 0) {
>>>> +            pmu = arm_feature(&armcpu->env, ARM_FEATURE_PMU);
>>>> +        }
>>>> +        assert(!armcpu || arm_feature(&armcpu->env, ARM_FEATURE_PMU) == pmu);
>>> This doesn't belong in this patch. The commit message doesn't even mention
>>> it. Also, I don't think we should do this here at all. If we want to
>>> ensure that all cpus have a pmu when one does, then that should be done
>>> somewhere like machvirt_init(), not in ACPI generation code which doesn't
>>> even run for non-ACPI VMs.
>> Sorry, I should have stated the reason of this change in the commit message.
>> Actually code change here and mp_affinity part below aim to make it correct
>> to create gicc entries for all possible cpus.
>>
>> We only initialize and realize cpuobj for present cpus in machvirt_init,
>> so that we will get null ARMCPU pointer here for the non-present cpus,
>> and consequently we won't able to check from "armcpu->env" for the
>> non-present cpus. The same about "armcpu->mp_affinity".
>>
>> That's the reason I use PMU configuration of the first cpu to represent the
>> others. I assume all cpus should have a pmu when one does here since it's
>> how armcpu->env is initialized. And the assert seems not needed here.
>>
>> Is there any better alternative way about this?
> Move the
>
>    if (arm_feature(&armcpu->env, ARM_FEATURE_PMU)) {
>        gicc->performance_interrupt = cpu_to_le32(PPI(VIRTUAL_PMU_IRQ));
>    }
>
> into the if (possible_cpus->cpus[i].cpu != NULL) block?
We can. But that only ensures gicc->performance_interrupt is initialized
for the enabled GICC entries, not for the disabled ones.
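
To illustrate the concern, here is a small standalone model, not QEMU code.
The struct names, the function name and the SKETCH_* constants are all made
up for this sketch, and the PPI value is only a placeholder. It fills one
entry per possible CPU and sets the PMU interrupt only inside the
"CPU is present" check:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for this sketch, standing in for the real macros. */
#define SKETCH_PMU_PPI      23u
#define SKETCH_GICC_ENABLED 1u

/* Hypothetical stand-in for one slot of the possible-CPU list. */
struct sketch_cpu_slot {
    bool present;   /* models possible_cpus->cpus[i].cpu != NULL */
    bool has_pmu;   /* models arm_feature(&armcpu->env, ARM_FEATURE_PMU) */
};

/* Hypothetical stand-in for the two GICC fields discussed here. */
struct sketch_gicc {
    uint32_t flags;
    uint32_t performance_interrupt;
};

/* Build one entry per possible CPU; PMU is only set for present CPUs. */
static void sketch_fill_gicc(const struct sketch_cpu_slot *slots,
                             struct sketch_gicc *out, size_t n)
{
    assert(slots != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        out[i].flags = 0;
        out[i].performance_interrupt = 0;

        if (slots[i].present) {
            out[i].flags = SKETCH_GICC_ENABLED;
            if (slots[i].has_pmu) {
                out[i].performance_interrupt = SKETCH_PMU_PPI;
            }
        }
        /* Non-present slots keep performance_interrupt == 0. */
    }
}
```

With one present CPU that has a PMU and one non-present CPU, the second
entry ends up disabled with performance_interrupt left at 0, which is the
case I am asking about.
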
>>>> +
>>>>            gicc->type = ACPI_APIC_GENERIC_CPU_INTERFACE;
>>>>            gicc->length = sizeof(*gicc);
>>>>            if (vms->gic_version == 2) {
>>>> @@ -504,11 +517,19 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>>>>                gicc->gicv_base_address = cpu_to_le64(memmap[VIRT_GIC_VCPU].base);
>>>>            }
>>>>            gicc->cpu_interface_number = cpu_to_le32(i);
>>>> -        gicc->arm_mpidr = cpu_to_le64(armcpu->mp_affinity);
>>>> +        gicc->arm_mpidr = cpu_to_le64(possible_cpus->cpus[i].arch_id);
>>> Hmm, I think we may have a problem. I don't think there's any guarantee
>>> that possible_cpus->cpus[i].arch_id == armcpu->mp_affinity, because
>>> arch_id comes from virt_cpu_mp_affinity(), which is arm_cpu_mp_affinity,
>>> but with a variable cluster size, however mp_affinity comes from
>>> arm_cpu_mp_affinity with a set cluster size. Also, when KVM is used,
>>> then all bets are off as to what mp_affinity is.
>> Right! Arch_id is initialized by virt_cpu_mp_affinity() in machvirt and then
>> mp_affinity is initialized by arch_id. Here they two have the same value.
>>
>> But mp_affinity will be overridden in kvm_arch_init_vcpu() when KVM is
>> enabled. Here they two won't have the same value.
>>> We need to add some code that ensures arch_id == mp_affinity,
>> Can we also update the arch_id at the same time when we change mp_affinity?
> The proper fix is to send patches to KVM enabling userspace to control
> MPIDR. Otherwise we can't be sure we don't have inconsistencies in QEMU,
> since some user of possible_cpus could have made decisions or copied IDs
> prior to KVM vcpu init time. Now, all that said, I think
> virt_cpu_mp_affinity() should be generating the same ID as KVM does, so
> maybe it doesn't matter in practice right now, but we're living with the
> risk that KVM could change. For now, maybe we should just sanity check
> that the KVM values match the possible_cpus values and emit warnings if
> they don't?
I think it may not be so reasonable to only emit warnings if they don't
match; rather, we should ensure that they match even if KVM changes.

Currently virt_cpu_mp_affinity() is only called by
virt_possible_cpu_arch_ids() to initialize possible_cpus, so one idea is to
move the code that resets "cpu->mp_affinity" from kvm_arch_init_vcpu() into
virt_cpu_mp_affinity(), where it would initialize arch_id instead. That way
mp_affinity only ever comes from arch_id and does not change later.
Would that work?
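
A rough standalone model of what I mean (again not QEMU code; every name
below is hypothetical, and the Aff1/Aff0 packing is only an assumption used
for illustration, not a claim about what KVM returns). The ID is computed in
exactly one place when the possible-CPU list is built, and vCPU init only
copies it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit position of the Aff1 field for this sketch. */
#define SKETCH_AFF1_SHIFT 8

/* Hypothetical stand-ins for a possible-CPU entry and a vCPU object. */
struct sketch_possible_cpu {
    uint64_t arch_id;
};

struct sketch_vcpu {
    uint64_t mp_affinity;
};

/* Single source of truth: Aff1 = idx / cluster, Aff0 = idx % cluster. */
static uint64_t sketch_mp_affinity(unsigned idx, unsigned cluster_size)
{
    assert(cluster_size > 0);

    uint64_t aff1 = idx / cluster_size;
    uint64_t aff0 = idx % cluster_size;

    return (aff1 << SKETCH_AFF1_SHIFT) | aff0;
}

/* Fill arch_id for every possible CPU, once. */
static void sketch_init_possible(struct sketch_possible_cpu *cpus, size_t n,
                                 unsigned cluster_size)
{
    for (size_t i = 0; i < n; i++) {
        cpus[i].arch_id = sketch_mp_affinity((unsigned)i, cluster_size);
    }
}

/* vCPU init copies the ID and never recomputes or overrides it. */
static void sketch_init_vcpu(struct sketch_vcpu *vcpu,
                             const struct sketch_possible_cpu *slot)
{
    vcpu->mp_affinity = slot->arch_id;
}
```

In this model nothing can read a stale arch_id, because there is no later
step that rewrites mp_affinity. Whether the value KVM expects is actually
available that early is the open question.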

BTW, I plan to split patches 4-6 into a separate patchset and repost it once
we have a mature solution for the problem, which is also Salil's suggestion.
Is that appropriate?

Thanks,
Yanan
> Thanks,
> drew