Subject: Re: [PATCH for-6.2 v5 3/5] hw/arm/virt: Add cpu-map to device tree
To: Peter Maydell, Andrew Jones, "Michael S . Tsirkin", Igor Mammedov
References: <20210805123921.62540-1-wangyanan55@huawei.com> <20210805123921.62540-4-wangyanan55@huawei.com>
From: "wangyanan (Y)"
Message-ID: <3bde66bd-d0ea-0960-b171-3bbd1990d977@huawei.com>
Date: Tue, 17 Aug 2021 10:10:44 +0800
In-Reply-To: <20210805123921.62540-4-wangyanan55@huawei.com>
Cc: Salil Mehta, qemu-devel@nongnu.org, Shannon Zhao, qemu-arm@nongnu.org, Alistair Francis, wanghaibin.wang@huawei.com, David Gibson

Hi,

On 2021/8/5 20:39, Yanan Wang wrote:
> From: Andrew Jones
>
> Support device tree CPU topology descriptions.
>
> In accordance with the Devicetree Specification, the Linux Doc
> "arm/cpus.yaml" requires that cpus and cpu nodes in the DT are
> present. And we have already met the requirement by generating
> /cpus/cpu@* nodes for members within ms->smp.cpus. Accordingly,
> we should also create subnodes in cpu-map for the present cpus,
> each of which relates to an unique cpu node.
>
> The Linux Doc "cpu/cpu-topology.txt" states that the hierarchy
> of CPUs in a SMP system is defined through four entities and
> they are socket/cluster/core/thread. It is also required that
> a socket node's child nodes must be one or more cluster nodes.
> Given that currently we are only provided with information of
> socket/core/thread, we assume there is one cluster child node
> in each socket node when creating cpu-map.
>
> Signed-off-by: Andrew Jones
> Co-developed-by: Yanan Wang
> Signed-off-by: Yanan Wang
> ---
>  hw/arm/virt.c | 59 ++++++++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 49 insertions(+), 10 deletions(-)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 82f2eba6bd..d1e294be95 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -350,20 +350,21 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>      int cpu;
>      int addr_cells = 1;
>      const MachineState *ms = MACHINE(vms);
> +    const VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
>      int smp_cpus = ms->smp.cpus;
>
>      /*
> -     * From Documentation/devicetree/bindings/arm/cpus.txt
> -     * On ARM v8 64-bit systems value should be set to 2,
> -     * that corresponds to the MPIDR_EL1 register size.
> -     * If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
> -     * in the system, #address-cells can be set to 1, since
> -     * MPIDR_EL1[63:32] bits are not used for CPUs
> -     * identification.
> +     * See Linux Documentation/devicetree/bindings/arm/cpus.yaml
> +     * On ARM v8 64-bit systems value should be set to 2,
> +     * that corresponds to the MPIDR_EL1 register size.
> +     * If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
> +     * in the system, #address-cells can be set to 1, since
> +     * MPIDR_EL1[63:32] bits are not used for CPUs
> +     * identification.
>       *
> -     * Here we actually don't know whether our system is 32- or 64-bit one.
> -     * The simplest way to go is to examine affinity IDs of all our CPUs. If
> -     * at least one of them has Aff3 populated, we set #address-cells to 2.
> +     * Here we actually don't know whether our system is 32- or 64-bit one.
> +     * The simplest way to go is to examine affinity IDs of all our CPUs. If
> +     * at least one of them has Aff3 populated, we set #address-cells to 2.
>       */
>      for (cpu = 0; cpu < smp_cpus; cpu++) {
>          ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(cpu));
> @@ -406,8 +407,46 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>                  ms->possible_cpus->cpus[cs->cpu_index].props.node_id);
>          }
>
> +        if (!vmc->no_cpu_topology) {
> +            qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle",
> +                                  qemu_fdt_alloc_phandle(ms->fdt));
> +        }
> +
>          g_free(nodename);
>      }
> +
> +    if (!vmc->no_cpu_topology) {
> +        /*
> +         * See Linux Documentation/devicetree/bindings/cpu/cpu-topology.txt
> +         * In a SMP system, the hierarchy of CPUs is defined through four
> +         * entities that are used to describe the layout of physical CPUs
> +         * in the system: socket/cluster/core/thread.
> +         */
> +        qemu_fdt_add_subnode(ms->fdt, "/cpus/cpu-map");
> +
> +        for (cpu = smp_cpus - 1; cpu >= 0; cpu--) {
> +            char *cpu_path = g_strdup_printf("/cpus/cpu@%d", cpu);
> +            char *map_path;
> +
> +            if (ms->smp.threads > 1) {
> +                map_path = g_strdup_printf(
> +                    "/cpus/cpu-map/socket%d/cluster0/core%d/thread%d",
> +                    cpu / (ms->smp.cores * ms->smp.threads),
> +                    (cpu / ms->smp.threads) % ms->smp.cores,
> +                    cpu % ms->smp.threads);

It seems that there is a discrepancy between the documentation
(Documentation/devicetree/bindings/cpu/cpu-topology.txt) and the
actual implementation of the DT topology parser for ARM64 (function
parse_dt_topology() in drivers/base/arch_topology.c).

The doc says the cpu-map node's child nodes can be:
    - one or more cluster nodes, or
    - one or more socket nodes in a multi-socket system

which means a cpu-map can be defined in either of two formats, such as:

1)
cpu-map
    socket0
        cluster0
            core0
            core1
        cluster1
            core0
            core1
    socket1
        cluster0
            core0
            core1
        cluster1
            core0
            core1

2)
cpu-map
    cluster0
        cluster0
            core0
            core1
        cluster1
            core0
            core1
    cluster1
        cluster0
            core0
            core1
        cluster1
            core0
            core1

But the current parser only assumes that there are nested clusters
within cpu-map and is unaware of sockets. The parser also ignores any
information about the nesting of clusters and presents the scheduler
with a flat list of them.

So with the current parser, we will get 4 packages (sockets) in total,
2 cores per package and 1 thread per core from 2), but will get
nothing useful from 1).

I think the ARM64 kernel DT parser should be improved so that it is
also aware of sockets and can parse both formats of cpu-map. But
until then, I think we still have to build the cpu-map in format 2)
if we hope to describe the topology successfully through DT. :)

Thanks,
Yanan

> +            } else {
> +                map_path = g_strdup_printf(
> +                    "/cpus/cpu-map/socket%d/cluster0/core%d",
> +                    cpu / ms->smp.cores,
> +                    cpu % ms->smp.cores);
> +            }
> +            qemu_fdt_add_path(ms->fdt, map_path);
> +            qemu_fdt_setprop_phandle(ms->fdt, map_path, "cpu", cpu_path);
> +
> +            g_free(map_path);
> +            g_free(cpu_path);
> +        }
> +    }
>  }
>
>  static void fdt_add_its_gic_node(VirtMachineState *vms)
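
As a rough illustration of the format 2) suggestion in the comment above, the cpu-map path for each CPU could be built with top-level cluster nodes standing in for sockets. This is only a sketch and not QEMU code: the helper name cpu_map_path_fmt2() and its parameters are hypothetical, snprintf() is used instead of g_strdup_printf() so that it compiles standalone, and keeping a single nested cluster0 under each top-level cluster is an assumption carried over from the patch.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of QEMU: write the cpu-map path of one
 * CPU in format 2), i.e. nested clusters only. The top-level clusterN
 * node stands in for socket N (assumption), and a single nested
 * cluster0 holds that socket's cores, as in the patch.
 *
 * Returns the snprintf() result: the length the full path needs.
 */
static int cpu_map_path_fmt2(char *buf, size_t len, int cpu,
                             int cores, int threads)
{
    /* Same index arithmetic as the quoted patch. */
    int socket = cpu / (cores * threads);
    int core = (cpu / threads) % cores;

    if (threads > 1) {
        return snprintf(buf, len,
                        "/cpus/cpu-map/cluster%d/cluster0/core%d/thread%d",
                        socket, core, cpu % threads);
    }

    return snprintf(buf, len,
                    "/cpus/cpu-map/cluster%d/cluster0/core%d",
                    socket, core);
}
```

For example, with cores=2 and threads=1, CPU 3 would map to /cpus/cpu-map/cluster1/cluster0/core1.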