qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "wangyanan (Y)" <wangyanan55@huawei.com>
To: Andrew Jones <drjones@redhat.com>
Cc: Barry Song <song.bao.hua@hisilicon.com>,
	Peter Maydell <peter.maydell@linaro.org>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	wanghaibin.wang@huawei.com, qemu-devel@nongnu.org,
	yangyicong@huawei.com, Shannon Zhao <shannon.zhaosl@gmail.com>,
	qemu-arm@nongnu.org, Alistair Francis <alistair.francis@wdc.com>,
	prime.zeng@hisilicon.com, Igor Mammedov <imammedo@redhat.com>,
	yuzenghui@huawei.com, Paolo Bonzini <pbonzini@redhat.com>,
	zhukeqian1@huawei.com, Jiajie Li <lijiajie11@huawei.com>,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [RFC PATCH v2 2/4] hw/arm/virt: Parse -smp cluster parameter in virt_smp_parse
Date: Fri, 30 Apr 2021 16:59:36 +0800	[thread overview]
Message-ID: <612c71d5-83cd-d055-48a4-e06153837f8d@huawei.com> (raw)
In-Reply-To: <20210430064125.3b5fjolwqculrjxz@gator.home>


On 2021/4/30 14:41, Andrew Jones wrote:
> On Fri, Apr 30, 2021 at 01:09:00PM +0800, wangyanan (Y) wrote:
>> Hi Drew,
>>
>> On 2021/4/29 19:02, Andrew Jones wrote:
>>> On Thu, Apr 29, 2021 at 04:56:06PM +0800, wangyanan (Y) wrote:
>>>> On 2021/4/29 15:16, Andrew Jones wrote:
>>>>> On Thu, Apr 29, 2021 at 10:14:37AM +0800, wangyanan (Y) wrote:
>>>>>> On 2021/4/28 18:31, Andrew Jones wrote:
>>>>>>> On Tue, Apr 13, 2021 at 04:31:45PM +0800, Yanan Wang wrote:
>>>>>>>>              } else if (sockets == 0) {
>>>>>>>>                  threads = threads > 0 ? threads : 1;
>>>>>>>> -            sockets = cpus / (cores * threads);
>>>>>>>> +            sockets = cpus / (clusters * cores * threads);
>>>>>>>>                  sockets = sockets > 0 ? sockets : 1;
>>>>>>> If we initialize clusters to zero instead of one and add lines in
>>>>>>> 'cpus == 0 || cores == 0' and 'sockets == 0' like
>>>>>>> 'clusters = clusters > 0 ? clusters : 1' as needed, then I think we can
>>>>>>> add
>>>>>>>
>>>>>>>      } else if (clusters == 0) {
>>>>>>>          threads = threads > 0 ? threads : 1;
>>>>>>>          clusters = cpus / (sockets * cores * thread);
>>>>>>>          clusters = clusters > 0 ? clusters : 1;
>>>>>>>      }
>>>>>>>
>>>>>>> here.
>>>>>> I have thought about this kind of format before, but there is a little bit
>>>>>> difference between these two ways. Let's chose the better and more
>>>>>> reasonable one of the two.
>>>>>>
>>>>>> Way A currently in this patch:
>>>>>> If value of clusters is not explicitly specified in -smp command line, we
>>>>>> assume
>>>>>> that users don't want to support clusters, for compatibility we initialized
>>>>>> the
>>>>>> value to 1. So that with cmdline "-smp cpus=24, sockets=2, cores=6", we will
>>>>>> parse out the topology description like below:
>>>>>> cpus=24, sockets=2, clusters=1, cores=6, threads=2
>>>>>>
>>>>>> Way B that you suggested for this patch:
>>>>>> Whether value of clusters is explicitly specified in -smp command line or
>>>>>> not,
>>>>>> we assume that clusters are supported and calculate the value. So that with
>>>>>> cmdline "-smp cpus=24, sockets=2, cores=6", we will parse out the topology
>>>>>> description like below:
>>>>>> cpus =24, sockets=2, clusters=2, cores=6, threads=1
>>>>>>
>>>>>> But I think maybe we should not assume too much about what users think
>>>>>> through the -smp command line. We should just assume that all levels of
>>>>>> cpu topology are supported and calculate them, and users should be more
>>>>>> careful if they want to get the expected results with not so complete
>>>>>> cmdline.
>>>>>> If I'm right, then Way B should be better. :)
>>>>>>
>>>>> Hi Yanan,
>>>>>
>>>>> We're already assuming the user wants to describe clusters to the guest
>>>>> because we require at least one per socket. If we want the user to have a
>>>>> choice between using clusters or not, then I guess we need to change the
>>>>> logic in the PPTT and the cpu-map to only generate the cluster level when
>>>>> the number of clusters is not zero. And then change this parser to not
>>>>> require clusters at all.
>>>> Hi Drew,
>>>>
>>>> I think this kind of change will introduce more complexity and actually is
>>>> not necessary.
>>>> Not generating cluster level at all and generating cluster level (one per
>>>> socket) are same
>>>> to kernel. Without cluster description provided, kernel will initialize all
>>>> cores in the same
>>>> cluster which also means one cluster per socket.
>>> Which kernel? All kernels? One without the cluster support patches [1]?
>>>
>>> [1] https://lore.kernel.org/lkml/20210420001844.9116-1-song.bao.hua@hisilicon.com/#t
>> I'm sorry, I didn't make it clear. :)
>> I actually mean the ARM64 kernel, with or without [1].
>>
>> Without [1]: Kernel doesn't care about cluster. When populating cpu
>> topology, it directly
>> finds the hierarchy node with "physical package flag" as package when
>> parsing ACPI, and
>> finds the core node's parent as package when parsing DT. So even we provide
>> cluster level
>> description (one per socket), the parsing results will be the same as not
>> providing at all.
>>
>> With [1]: Kernel finds the core hierarchy node's parent as cluster when
>> parsing ACPI. So if
>> we don't provide cluster level description, package will be taken as
>> cluster. And if we provide
>> the description (one per socket), the parsing result will also be the same.
>>
>> That's why I said that we just need to provide description of cluster (one
>> per socket) if we
>> don't want to make use of it in VMs.
> OK, that sounds good.
>
>> [1] https://lore.kernel.org/lkml/20210420001844.9116-1-song.bao.hua@hisilicon.com/#t
>>>> So we should only ensure value of clusters per socket is one if we don't
>>>> want to use clusters,
>>>> and don't need to care about whether or not to generate description in PPTT
>>>> and cpu-map.
>>>> Is this right?
>>> Depends on your answer to my 'which kernel' questions.
>>>
>>>>> I'm not a big fan of these auto-calculated values either, but the
>>>>> documentation says that it'll do that, and it's been done that way
>>>>> forever, so I think we're stuck with it for the -smp option. Hmm, I was
>>>>> just about to say that x86 computes all its values, but I see the most
>>>>> recently added one, 'dies', is implemented the way you're proposing we
>>>>> implement 'clusters', i.e. default to one and don't calculate it when it's
>>>>> missing. I actually consider that either a documentation bug or an smp
>>>>> parsing bug, though.
>>>> My propose originally came from implementation of x86.
>>>>> Another possible option, for Arm, because only the cpus and maxcpus
>>>>> parameters of -smp have ever worked, is to document, for Arm, that if even
>>>>> one parameter other than cpus or maxcpus is provided, then all parameters
>>>>> must be provided. We can still decide if clusters=0 is valid, but we'll
>>>>> enforce that everything is explicit and that the product (with or without
>>>>> clusters) matches maxcpus.
>>>> Requiring every parameter explicitly will be most stable but indeed strict.
>>>>
>>>> Currently all the parsers use way B to calculate value of thread if it is
>>>> not provided explicitly.
>>>> So users should ensure the -smp cmdline they provided can result in that
>>>> parsed threads will
>>>> be 1 if they don't want to support multiple threads in one core.
>>>>
>>>> Very similar to thread, users should also ensure the provided cmdline can
>>>> result in that parsed
>>>> clusters will be 1 if they don't want to support multiple clusters in one
>>>> socket.
>>>>
>>>> So I'm wondering if we can just add some commit in the documentation to tell
>>>> users that they
>>>> should ensure this if they don't want support it. And as for calculation of
>>>> clusters, we follow
>>>> the logic of other parameters as you suggested in way B.
>>>>
>>>> Thanks,
>>>> Yanan
>>>>> Requiring every parameter might be stricter than necessary, though, I
>>>>> think we're mostly concerned with cpus/maxcpus, sockets, and cores.
>>>>> clusters can default to one or zero (whatever we choose and document),
>>>>> threads can default to one, and cpus can default to maxcpus or maxcpus can
>>>>> default to cpus, but at least one of those must be provided. And, if
>>>>> sockets are provided, then cores must be provided and vice versa. If
>>>>> neither sockets nor cores are provided, then nothing else besides cpus and
>>>>> maxcpus may be provided, and that would mean to not generate any topology
>>>>> descriptions for the guest.
>>> I still don't know. I think I prefer making -smp more strict (even for
>>> other architectures, but that's more difficult to do than for Arm.) What I
>>> wrote above isn't that bad. We only require one of cpus or maxcpus (pretty
>>> much everybody already does that anyway), and then, if given sockets
>>> or cores, the other will also be required. I assume anybody who bothers to
>>> specify one or the other would already specify both anyway.
>> I agree to make -smp more strict. We want to expose the cpu topology
>> information
>> to guest kernel, so that it can take advantage of the information for better
>> scheduling.
>>  From this point of view, we hope the topology descriptions are accurately
>> specified
>> but not automatically populated.
>>
>> But I think the requirement for ARM "if even one parameter other than cpus
>> or maxcpus
>> is provided then all parameters must be provided" will be better. This can
>> ensure the
>> whole accurate users-specified topology. As you mentioned, if anybody who
>> bothers
>> to specify one, why not also specify the others.
>>
>> I can add the requirement for ARM in the documentation, and also check the
>> parameters
>> in virt_smp_parse. Will this be fine?
> We sort of have to support command lines that are missing 'maxcpus' and
> 'clusters', unless we work together with libvirt to make the change.
> Currently libvirt will generate '-smp 16,sockets=16,cores=1,threads=1'
> from '<vcpu placement='static'>16</vcpu>'.
I see. And libvirt currently doesn't support cluster in xml, which means
we can not generate complete cmdlines with cluster in it through
<topology ...> specification in xml.
> That's sufficient for our
> stricter, but not completely strict requirements. And, I still think
> 'threads' could be optional, because there's a good chance the user
> doesn't want to describe them, so a default of 1 is good enough.
So the parsing logic can be repeated like this:
We require at least one of cpus and maxcpus specified explicitly, and 
default
cluster/thread to 1 if not explicitly specified. And require both of 
sockets and
cores if one of them is specified.

This is consistent with what you mentioned yesterday.

Thanks,
Yanan
>   Also,
> given maxcpus, but not cpus, it's pretty obvious that cpus should equal
> maxcpus.
>
> Thanks,
> drew
>
> .


  parent reply	other threads:[~2021-04-30  9:01 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-13  8:31 [RFC PATCH v2 0/4] hw/arm/virt: Introduce cluster cpu topology support Yanan Wang
2021-04-13  8:31 ` [RFC PATCH v2 1/4] vl.c: Add -smp, clusters=* command line support for ARM cpu Yanan Wang
2021-04-28 10:23   ` Andrew Jones
2021-04-29  1:22     ` wangyanan (Y)
2021-04-13  8:31 ` [RFC PATCH v2 2/4] hw/arm/virt: Parse -smp cluster parameter in virt_smp_parse Yanan Wang
2021-04-28 10:31   ` Andrew Jones
2021-04-29  2:14     ` wangyanan (Y)
2021-04-29  7:16       ` Andrew Jones
2021-04-29  8:56         ` wangyanan (Y)
2021-04-29 11:02           ` Andrew Jones
2021-04-30  5:09             ` wangyanan (Y)
2021-04-30  6:41               ` Andrew Jones
2021-04-30  7:01                 ` Andrew Jones
2021-04-30  9:33                   ` wangyanan (Y)
2021-04-30 10:49                     ` Andrew Jones
2021-05-06  7:04                       ` wangyanan (Y)
2021-04-30  8:59                 ` wangyanan (Y) [this message]
2021-04-30 10:48                   ` Andrew Jones
2021-04-13  8:31 ` [RFC PATCH v2 3/4] hw/arm/virt-acpi-build: Add cluster level for PPTT table Yanan Wang
2021-04-28 10:41   ` Andrew Jones
2021-04-13  8:31 ` [RFC PATCH v2 4/4] hw/arm/virt: Add cluster level for device tree Yanan Wang
2021-04-28 10:46   ` Andrew Jones

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=612c71d5-83cd-d055-48a4-e06153837f8d@huawei.com \
    --to=wangyanan55@huawei.com \
    --cc=alistair.francis@wdc.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=drjones@redhat.com \
    --cc=imammedo@redhat.com \
    --cc=lijiajie11@huawei.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=prime.zeng@hisilicon.com \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=shannon.zhaosl@gmail.com \
    --cc=song.bao.hua@hisilicon.com \
    --cc=wanghaibin.wang@huawei.com \
    --cc=yangyicong@huawei.com \
    --cc=yuzenghui@huawei.com \
    --cc=zhukeqian1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).