qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Jones <drjones@redhat.com>
To: "wangyanan (Y)" <wangyanan55@huawei.com>
Cc: Barry Song <song.bao.hua@hisilicon.com>,
	Peter Maydell <peter.maydell@linaro.org>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	wanghaibin.wang@huawei.com, qemu-devel@nongnu.org,
	yangyicong@huawei.com, Shannon Zhao <shannon.zhaosl@gmail.com>,
	qemu-arm@nongnu.org, Alistair Francis <alistair.francis@wdc.com>,
	prime.zeng@hisilicon.com, Igor Mammedov <imammedo@redhat.com>,
	yuzenghui@huawei.com, Paolo Bonzini <pbonzini@redhat.com>,
	zhukeqian1@huawei.com, Jiajie Li <lijiajie11@huawei.com>,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [RFC PATCH v2 2/4] hw/arm/virt: Parse -smp cluster parameter in virt_smp_parse
Date: Fri, 30 Apr 2021 08:41:25 +0200	[thread overview]
Message-ID: <20210430064125.3b5fjolwqculrjxz@gator.home> (raw)
In-Reply-To: <f5b264ff-58ed-0cd2-3b84-42fa1724b8ac@huawei.com>

On Fri, Apr 30, 2021 at 01:09:00PM +0800, wangyanan (Y) wrote:
> Hi Drew,
> 
> On 2021/4/29 19:02, Andrew Jones wrote:
> > On Thu, Apr 29, 2021 at 04:56:06PM +0800, wangyanan (Y) wrote:
> > > On 2021/4/29 15:16, Andrew Jones wrote:
> > > > On Thu, Apr 29, 2021 at 10:14:37AM +0800, wangyanan (Y) wrote:
> > > > > On 2021/4/28 18:31, Andrew Jones wrote:
> > > > > > On Tue, Apr 13, 2021 at 04:31:45PM +0800, Yanan Wang wrote:
> > > > > > >             } else if (sockets == 0) {
> > > > > > >                 threads = threads > 0 ? threads : 1;
> > > > > > > -            sockets = cpus / (cores * threads);
> > > > > > > +            sockets = cpus / (clusters * cores * threads);
> > > > > > >                 sockets = sockets > 0 ? sockets : 1;
> > > > > > If we initialize clusters to zero instead of one and add lines in
> > > > > > 'cpus == 0 || cores == 0' and 'sockets == 0' like
> > > > > > 'clusters = clusters > 0 ? clusters : 1' as needed, then I think we can
> > > > > > add
> > > > > > 
> > > > > >     } else if (clusters == 0) {
> > > > > >         threads = threads > 0 ? threads : 1;
> > > > > >         clusters = cpus / (sockets * cores * thread);
> > > > > >         clusters = clusters > 0 ? clusters : 1;
> > > > > >     }
> > > > > > 
> > > > > > here.
> > > > > I have thought about this kind of format before, but there is a little bit
> > > > > difference between these two ways. Let's chose the better and more
> > > > > reasonable one of the two.
> > > > > 
> > > > > Way A currently in this patch:
> > > > > If value of clusters is not explicitly specified in -smp command line, we
> > > > > assume
> > > > > that users don't want to support clusters, for compatibility we initialized
> > > > > the
> > > > > value to 1. So that with cmdline "-smp cpus=24, sockets=2, cores=6", we will
> > > > > parse out the topology description like below:
> > > > > cpus=24, sockets=2, clusters=1, cores=6, threads=2
> > > > > 
> > > > > Way B that you suggested for this patch:
> > > > > Whether value of clusters is explicitly specified in -smp command line or
> > > > > not,
> > > > > we assume that clusters are supported and calculate the value. So that with
> > > > > cmdline "-smp cpus=24, sockets=2, cores=6", we will parse out the topology
> > > > > description like below:
> > > > > cpus =24, sockets=2, clusters=2, cores=6, threads=1
> > > > > 
> > > > > But I think maybe we should not assume too much about what users think
> > > > > through the -smp command line. We should just assume that all levels of
> > > > > cpu topology are supported and calculate them, and users should be more
> > > > > careful if they want to get the expected results with not so complete
> > > > > cmdline.
> > > > > If I'm right, then Way B should be better. :)
> > > > > 
> > > > Hi Yanan,
> > > > 
> > > > We're already assuming the user wants to describe clusters to the guest
> > > > because we require at least one per socket. If we want the user to have a
> > > > choice between using clusters or not, then I guess we need to change the
> > > > logic in the PPTT and the cpu-map to only generate the cluster level when
> > > > the number of clusters is not zero. And then change this parser to not
> > > > require clusters at all.
> > > Hi Drew,
> > > 
> > > I think this kind of change will introduce more complexity and actually is
> > > not necessary.
> > > Not generating cluster level at all and generating cluster level (one per
> > > socket) are same
> > > to kernel. Without cluster description provided, kernel will initialize all
> > > cores in the same
> > > cluster which also means one cluster per socket.
> > Which kernel? All kernels? One without the cluster support patches [1]?
> > 
> > [1] https://lore.kernel.org/lkml/20210420001844.9116-1-song.bao.hua@hisilicon.com/#t
> I'm sorry, I didn't make it clear. :)
> I actually mean the ARM64 kernel, with or without [1].
> 
> Without [1]: Kernel doesn't care about cluster. When populating cpu
> topology, it directly
> finds the hierarchy node with "physical package flag" as package when
> parsing ACPI, and
> finds the core node's parent as package when parsing DT. So even we provide
> cluster level
> description (one per socket), the parsing results will be the same as not
> providing at all.
> 
> With [1]: Kernel finds the core hierarchy node's parent as cluster when
> parsing ACPI. So if
> we don't provide cluster level description, package will be taken as
> cluster. And if we provide
> the description (one per socket), the parsing result will also be the same.
> 
> That's why I said that we just need to provide description of cluster (one
> per socket) if we
> don't want to make use of it in VMs.

OK, that sounds good.

> 
> [1] https://lore.kernel.org/lkml/20210420001844.9116-1-song.bao.hua@hisilicon.com/#t
> > > So we should only ensure value of clusters per socket is one if we don't
> > > want to use clusters,
> > > and don't need to care about whether or not to generate description in PPTT
> > > and cpu-map.
> > > Is this right?
> > Depends on your answer to my 'which kernel' questions.
> > 
> > > > I'm not a big fan of these auto-calculated values either, but the
> > > > documentation says that it'll do that, and it's been done that way
> > > > forever, so I think we're stuck with it for the -smp option. Hmm, I was
> > > > just about to say that x86 computes all its values, but I see the most
> > > > recently added one, 'dies', is implemented the way you're proposing we
> > > > implement 'clusters', i.e. default to one and don't calculate it when it's
> > > > missing. I actually consider that either a documentation bug or an smp
> > > > parsing bug, though.
> > > My propose originally came from implementation of x86.
> > > > Another possible option, for Arm, because only the cpus and maxcpus
> > > > parameters of -smp have ever worked, is to document, for Arm, that if even
> > > > one parameter other than cpus or maxcpus is provided, then all parameters
> > > > must be provided. We can still decide if clusters=0 is valid, but we'll
> > > > enforce that everything is explicit and that the product (with or without
> > > > clusters) matches maxcpus.
> > > Requiring every parameter explicitly will be most stable but indeed strict.
> > > 
> > > Currently all the parsers use way B to calculate value of thread if it is
> > > not provided explicitly.
> > > So users should ensure the -smp cmdline they provided can result in that
> > > parsed threads will
> > > be 1 if they don't want to support multiple threads in one core.
> > > 
> > > Very similar to thread, users should also ensure the provided cmdline can
> > > result in that parsed
> > > clusters will be 1 if they don't want to support multiple clusters in one
> > > socket.
> > > 
> > > So I'm wondering if we can just add some commit in the documentation to tell
> > > users that they
> > > should ensure this if they don't want support it. And as for calculation of
> > > clusters, we follow
> > > the logic of other parameters as you suggested in way B.
> > > 
> > > Thanks,
> > > Yanan
> > > > Requiring every parameter might be stricter than necessary, though, I
> > > > think we're mostly concerned with cpus/maxcpus, sockets, and cores.
> > > > clusters can default to one or zero (whatever we choose and document),
> > > > threads can default to one, and cpus can default to maxcpus or maxcpus can
> > > > default to cpus, but at least one of those must be provided. And, if
> > > > sockets are provided, then cores must be provided and vice versa. If
> > > > neither sockets nor cores are provided, then nothing else besides cpus and
> > > > maxcpus may be provided, and that would mean to not generate any topology
> > > > descriptions for the guest.
> > I still don't know. I think I prefer making -smp more strict (even for
> > other architectures, but that's more difficult to do than for Arm.) What I
> > wrote above isn't that bad. We only require one of cpus or maxcpus (pretty
> > much everybody already does that anyway), and then, if given sockets
> > or cores, the other will also be required. I assume anybody who bothers to
> > specify one or the other would already specify both anyway.
> I agree to make -smp more strict. We want to expose the cpu topology
> information
> to guest kernel, so that it can take advantage of the information for better
> scheduling.
> From this point of view, we hope the topology descriptions are accurately
> specified
> but not automatically populated.
> 
> But I think the requirement for ARM "if even one parameter other than cpus
> or maxcpus
> is provided then all parameters must be provided" will be better. This can
> ensure the
> whole accurate users-specified topology. As you mentioned, if anybody who
> bothers
> to specify one, why not also specify the others.
> 
> I can add the requirement for ARM in the documentation, and also check the
> parameters
> in virt_smp_parse. Will this be fine?

We sort of have to support command lines that are missing 'maxcpus' and
'clusters', unless we work together with libvirt to make the change.
Currently libvirt will generate '-smp 16,sockets=16,cores=1,threads=1'
from '<vcpu placement='static'>16</vcpu>'. That's sufficient for our
stricter, but not completely strict requirements. And, I still think
'threads' could be optional, because there's a good chance the user
doesn't want to describe them, so a default of 1 is good enough. Also,
given maxcpus, but not cpus, it's pretty obvious that cpus should equal
maxcpus.

Thanks,
drew



  reply	other threads:[~2021-04-30  6:56 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-13  8:31 [RFC PATCH v2 0/4] hw/arm/virt: Introduce cluster cpu topology support Yanan Wang
2021-04-13  8:31 ` [RFC PATCH v2 1/4] vl.c: Add -smp, clusters=* command line support for ARM cpu Yanan Wang
2021-04-28 10:23   ` Andrew Jones
2021-04-29  1:22     ` wangyanan (Y)
2021-04-13  8:31 ` [RFC PATCH v2 2/4] hw/arm/virt: Parse -smp cluster parameter in virt_smp_parse Yanan Wang
2021-04-28 10:31   ` Andrew Jones
2021-04-29  2:14     ` wangyanan (Y)
2021-04-29  7:16       ` Andrew Jones
2021-04-29  8:56         ` wangyanan (Y)
2021-04-29 11:02           ` Andrew Jones
2021-04-30  5:09             ` wangyanan (Y)
2021-04-30  6:41               ` Andrew Jones [this message]
2021-04-30  7:01                 ` Andrew Jones
2021-04-30  9:33                   ` wangyanan (Y)
2021-04-30 10:49                     ` Andrew Jones
2021-05-06  7:04                       ` wangyanan (Y)
2021-04-30  8:59                 ` wangyanan (Y)
2021-04-30 10:48                   ` Andrew Jones
2021-04-13  8:31 ` [RFC PATCH v2 3/4] hw/arm/virt-acpi-build: Add cluster level for PPTT table Yanan Wang
2021-04-28 10:41   ` Andrew Jones
2021-04-13  8:31 ` [RFC PATCH v2 4/4] hw/arm/virt: Add cluster level for device tree Yanan Wang
2021-04-28 10:46   ` Andrew Jones

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210430064125.3b5fjolwqculrjxz@gator.home \
    --to=drjones@redhat.com \
    --cc=alistair.francis@wdc.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=imammedo@redhat.com \
    --cc=lijiajie11@huawei.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=prime.zeng@hisilicon.com \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=shannon.zhaosl@gmail.com \
    --cc=song.bao.hua@hisilicon.com \
    --cc=wanghaibin.wang@huawei.com \
    --cc=wangyanan55@huawei.com \
    --cc=yangyicong@huawei.com \
    --cc=yuzenghui@huawei.com \
    --cc=zhukeqian1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).