All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Zengtao (B)" <prime.zeng@hisilicon.com>
To: Sudeep Holla <sudeep.holla@arm.com>
Cc: Linuxarm <linuxarm@huawei.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Morten Rasmussen" <morten.rasmussen@arm.com>
Subject: RE: [PATCH] cpu-topology: warn if NUMA configurations conflicts with lower layer
Date: Thu, 2 Jan 2020 12:47:01 +0000	[thread overview]
Message-ID: <678F3D1BB717D949B966B68EAEB446ED340AEB67@dggemm526-mbx.china.huawei.com> (raw)
In-Reply-To: <20200102112955.GC4864@bogus>

> -----Original Message-----
> From: Sudeep Holla [mailto:sudeep.holla@arm.com]
> Sent: Thursday, January 02, 2020 7:30 PM
> To: Zengtao (B)
> Cc: Linuxarm; Greg Kroah-Hartman; Rafael J. Wysocki;
> linux-kernel@vger.kernel.org; Morten Rasmussen
> Subject: Re: [PATCH] cpu-topology: warn if NUMA configurations conflicts
> with lower layer
> 
> On Thu, Jan 02, 2020 at 03:05:40AM +0000, Zengtao (B) wrote:
> > Hi Sudeep:
> >
> > Thanks for your reply.
> >
> > > -----Original Message-----
> > > From: Sudeep Holla [mailto:sudeep.holla@arm.com]
> > > Sent: Wednesday, January 01, 2020 12:41 AM
> > > To: Zengtao (B)
> > > Cc: Linuxarm; Greg Kroah-Hartman; Rafael J. Wysocki;
> > > linux-kernel@vger.kernel.org; Sudeep Holla; Morten Rasmussen
> > > Subject: Re: [PATCH] cpu-topology: warn if NUMA configurations
> conflicts
> > > with lower layer
> > >
> > > On Mon, Dec 23, 2019 at 04:16:19PM +0800, z00214469 wrote:
> > > > As we know, from sched domain's perspective, the DIE layer should
> be
> > > > larger than or at least equal to the MC layer, and in some cases, MC
> > > > is defined by the arch specified hardware, MPIDR for example, but
> > > NUMA
> > > > can be defined by users,
> > >
> > > Who are the users you are referring above ?
> > For example, when I use QEMU to start a guest linux, I can define the
> > NUMA topology of the guest linux whatever i want.
> 
> OK and how is the information passed to the kernel ? DT or ACPI ?
> We need to fix the miss match if any during the initial parse of those
> information.
> 

Both, For the current QEMU, we don't have the correct cpu topology
passed to linux. Luckily drjones planed to deal with the issue.
https://patchwork.ozlabs.org/cover/939301/

> > > > with the following system configrations:
> > >
> > > Do you mean ACPI tables or DT or some firmware tables ?
> > >
> > > > *************************************
> > > > NUMA:      	 0-2,  3-7
> > >
> > > Is the above simply wrong with respect to hardware and it actually
> match
> > > core_siblings ?
> > >
> > Actually, we can't simply say this is wrong, i just want to show an
> example.
> > And this example also can be:
> > NUMA:  0-23,  24-47
> > core_siblings:   0-15,  16-31, 32-47
> >
> 
> Are you sure of the above ? Possible values w.r.t hardware config:
> core_siblings:   0-15,  16-23, 24-31, 32-47
> 
> But what you have specified above is still wrong core_siblings IMO.
> 
It depends on the hardware, on my platform, 16 cores per cluster.

> 
> [...]
> 
> > > > diff --git a/drivers/base/arch_topology.c
> b/drivers/base/arch_topology.c
> > > > index 1eb81f11..5fe44b3 100644
> > > > --- a/drivers/base/arch_topology.c
> > > > +++ b/drivers/base/arch_topology.c
> > > > @@ -439,10 +439,18 @@ const struct cpumask
> > > *cpu_coregroup_mask(int cpu)
> > > >  	if (cpumask_subset(&cpu_topology[cpu].core_sibling,
> core_mask)) {
> > > >  		/* not numa in package, lets use the package siblings */
> > > >  		core_mask = &cpu_topology[cpu].core_sibling;
> > > > -	}
> > > > +	} else
> > > > +		pr_warn_once("Warning: suspicous broken topology:
> cpu:[%d]'s
> > > core_sibling:[%*pbl] not a subset of numa node:[%*pbl]\n",
> > > > +			cpu,
> cpumask_pr_args(&cpu_topology[cpu].core_sibling),
> > > > +			cpumask_pr_args(core_mask));
> > > > +
> > >
> > > Won't this print warning on all systems that don't have numa within a
> > > package ? What are you trying to achieve here ?
> >
> > Since in my case, when this corner case happens, the linux kernel just fall
> into
> > dead loop with no prompt, here this is a helping message will help a lot.
> >
> 
> As I said, wrong configurations need to be detected when generating
> DT/ACPI if possible. The above will print warning on systems with NUMA
> within package.
> 
> NUMA:  0-7, 8-15
> core_siblings:   0-15
> 
> The above is the example where the die has 16 CPUs and 2 NUMA nodes
> within a package, your change throws error to the above config which is
> wrong.
>
From your example, the core 7 and core 8 has got different LLC but the same Low
Level cache?
From schedule view of point, lower level sched domain should be a subset of higher
Level sched domain.

> > >
> > > >  	if (cpu_topology[cpu].llc_id != -1) {
> > > >  		if (cpumask_subset(&cpu_topology[cpu].llc_sibling,
> core_mask))
> > > >  			core_mask = &cpu_topology[cpu].llc_sibling;
> > > > +		else
> > > > +			pr_warn_once("Warning: suspicous broken topology:
> > > cpu:[%d]'s llc_sibling:[%*pbl] not a subset of numa node:[%*pbl]\n",
> > > > +				cpu,
> > > cpumask_pr_args(&cpu_topology[cpu].llc_sibling),
> > > > +				cpumask_pr_args(core_mask));
> > > >  	}
> > > >
> > >
> > > This will trigger warning on all systems that lack cacheinfo topology.
> > > I don't understand the intent of this patch at all. Can you explain
> > > all the steps you follow and the issue you face ?
> >
> > Can you show me an example, what I really want to warn is the case that
> > NUMA topology conflicts with lower level.
> >
> 
> I was wrong here, I mis-read this section. I still fail to understand
> why the above change is needed. I understood the QEMU example, but you
> haven't specified how cacheinfo looks like there.
> 

My idea:
(1) If the dtb/acpi is not legal, the linux kernel should check or prompt to help debug.
(2)From scheduler view of point, low level sched domain should be a subset of high level
Sched domain.

I am very sure about (2), any example to show me (2) is wrong?

Thanks. 
Zengtao 


  reply	other threads:[~2020-01-02 12:47 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-23  8:16 [PATCH] cpu-topology: warn if NUMA configurations conflicts with lower layer z00214469
2019-12-31 16:40 ` Sudeep Holla
2020-01-02  3:05   ` Zengtao (B)
2020-01-02 11:29     ` Sudeep Holla
2020-01-02 12:47       ` Zengtao (B) [this message]
2020-01-02 13:22         ` Valentin Schneider
2020-01-02 19:30           ` Dietmar Eggemann
2020-01-03  4:24           ` Zengtao (B)
2020-01-03 10:57             ` Valentin Schneider
2020-01-03 12:14               ` Valentin Schneider
2020-01-03 17:20                 ` Dietmar Eggemann
2020-01-06  1:48                   ` Zengtao (B)
2020-01-06 14:31                     ` Dietmar Eggemann
2020-01-08  2:19                       ` Zengtao (B)
2020-01-09 11:05                       ` Morten Rasmussen
2020-01-09 12:07                         ` Dietmar Eggemann
2020-01-06  1:52                 ` Zengtao (B)
2020-01-03 11:40             ` Sudeep Holla
2020-01-06  1:37               ` Zengtao (B)
2020-01-09 10:43                 ` Morten Rasmussen
2020-01-09 12:58                   ` Zengtao (B)
2020-01-11 20:56                     ` Valentin Schneider
2020-01-13  6:51                       ` Zengtao (B)
2020-01-13 11:16                         ` Valentin Schneider
2020-01-13 12:08                           ` Zengtao (B)
2020-01-13 12:22                             ` Dietmar Eggemann
2020-01-13 14:49                       ` Dietmar Eggemann
2020-01-13 15:15                         ` Valentin Schneider
2020-01-09 10:52           ` Morten Rasmussen
2020-01-12 13:22             ` Valentin Schneider
2020-01-13 13:22               ` Morten Rasmussen
2020-01-02 13:59         ` Sudeep Holla

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=678F3D1BB717D949B966B68EAEB446ED340AEB67@dggemm526-mbx.china.huawei.com \
    --to=prime.zeng@hisilicon.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=morten.rasmussen@arm.com \
    --cc=rafael@kernel.org \
    --cc=sudeep.holla@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.